How to tell whether a bot claiming to be Googlebot is a real Googlebot: verify Googlebot

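The answer is to run a reverse DNS lookup on the requesting IP address, then a forward DNS lookup on the returned hostname to confirm that it resolves back to the same IP. The hostname should be under googlebot.com or google.com.

Let's see some straightforward examples: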

Example 1:

> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.

> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1

Example 2:

> host 66.249.90.77
77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.

> host rate-limited-proxy-66-249-90-77.google.com
rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77

Reference: https://support.google.com/webmasters/answer/80553?hl=en
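
The two lookups shown in the examples can be scripted. The following is a minimal sketch, assuming the host utility is installed and network access is available. The function names is_google_hostname and verify_googlebot are hypothetical helpers, and the accepted domains are only the two that appear in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the hostname is under
# googlebot.com or google.com (the domains seen in the examples above).
is_google_hostname() {
    case "${1%.}" in                      # drop the trailing dot printed by host
        *.googlebot.com|*.google.com) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: reverse lookup, domain check, then forward lookup.
# The body runs in a subshell so its variables stay local.
verify_googlebot() (
    ip="$1"
    name=$(host "$ip" | awk '/domain name pointer/ {print $NF; exit}')
    [ -n "$name" ] || exit 1
    is_google_hostname "$name" || exit 1
    # The forward lookup must list the original IP as one of its addresses.
    host "$name" | awk -v ip="$ip" '/has address/ && $NF == ip {ok = 1} END {exit !ok}'
)
```

With network access, a call such as verify_googlebot 66.249.66.1 && echo verified runs both lookups. Matching on the domain suffix (with a leading dot) rather than on a substring is deliberate, so that a name like googlebot.com.example.org is rejected.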

Using the Linux find command: match files by name pattern and change time, then pass them to -exec for further processing

This is a good example of combining the find command with other commands: find the files, then send them on for post-processing.

If you are dealing with files, you might wonder what the difference is between mtime, ctime and atime.

mtime, or modification time, is when the file was last modified. When you change the contents of a file, its mtime changes.

ctime, or change time, is when the file’s metadata (its inode) last changed. It is always updated when the mtime changes, but also when you change the file’s permissions, ownership, name or location.

atime, or access time, is updated when the file’s contents are read by an application or a command such as grep or cat.

An easy way to remember how they relate:

  • atime can be updated alone: reading a file changes neither of the other two.

  • ctime can be updated without mtime, for example by chmod or a rename.

  • mtime never changes alone: modifying a file’s contents also updates its ctime.
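
A quick way to check which timestamps an operation touches is to record them before and after. This is a minimal sketch, assuming GNU stat (the -c option) and a filesystem with at least one-second timestamp resolution; the helper name changed is hypothetical.

```shell
#!/bin/sh
# Assumes GNU stat: %Y is mtime and %Z is ctime, both in epoch seconds.
changed() { [ "$1" != "$2" ] && echo yes || echo no; }

f=$(mktemp)
m1=$(stat -c %Y "$f"); c1=$(stat -c %Z "$f")

sleep 1
chmod 644 "$f"                  # metadata change only
m2=$(stat -c %Y "$f"); c2=$(stat -c %Z "$f")

sleep 1
echo "new content" >> "$f"      # content change
m3=$(stat -c %Y "$f"); c3=$(stat -c %Z "$f")

echo "after chmod: mtime changed: $(changed "$m1" "$m2"), ctime changed: $(changed "$c1" "$c2")"
echo "after write: mtime changed: $(changed "$m2" "$m3"), ctime changed: $(changed "$c2" "$c3")"
rm -f "$f"
```

atime is left out of the sketch because its behaviour depends on mount options such as relatime and noatime.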

find /myDir -name 'log*' -and -not -name '*.bz2' -ctime +7 -exec bzip2 -zv {} \;
# Finds files whose names start with "log" and do not end in ".bz2", whose status changed more than 7 days ago, and passes each one to bzip2 via -exec
- {} is replaced by the path of each file found; \; ends the -exec command
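
Before letting -exec modify anything, it can help to preview what the filters match by using -print instead. This is a minimal sketch on a throwaway directory, assuming GNU touch and GNU find. It uses -mtime rather than -ctime only because a file's ctime cannot be set to a past date for a demo; the file names are made up.

```shell
#!/bin/sh
d=$(mktemp -d)

# Three files that look 10 days old, and one fresh file.
touch -d '10 days ago' "$d/log-old.txt" "$d/log-old.bz2" "$d/notes-old.txt"
touch "$d/log-new.txt"

# Same name filters as the command above, with -print as a dry run.
matches=$(find "$d" -type f -name 'log*' -not -name '*.bz2' -mtime +7 -print)
echo "$matches"

rm -rf "$d"
```

Only log-old.txt is listed: log-old.bz2 is excluded by the second -name test, notes-old.txt does not start with "log", and log-new.txt is too recent.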
References:
Linux find command - MUST KNOW:

Example:

find /dir -cmin -60 # status change time (not creation time)
find /dir -mmin -60 # modification time
find /dir -amin -60 # access time
 Numeric arguments can be specified as
   +n     for greater than n,
   -n     for less than n,
   n      for exactly n.
   -amin n
          File was last accessed n minutes ago.
   -anewer file
          File was last accessed more recently than file was modified.  If file is a symbolic link and the -H option or the -L option is in effect, the access time of the file it points  to  is  always
          used.
   -atime n
          File  was  last  accessed  n*24 hours ago.  When find figures out how many 24-hour periods ago the file was last accessed, any fractional part is ignored, so to match -atime +1, a file has to
          have been accessed at least two days ago.
   -cmin n
          File's status was last changed n minutes ago.
   -cnewer file
          File's status was last changed more recently than file was modified.  If file is a symbolic link and the -H option or the -L option is in effect, the status-change time of the file it  points
          to is always used.
   -ctime n
          File's status was last changed n*24 hours ago.  See the comments for -atime to understand how rounding affects the interpretation of file status change times.
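
The rounding rule is easier to see with concrete ages. This is a minimal sketch, assuming GNU touch and GNU find, with made-up file names: a file modified 30 hours ago counts as 1 full day, and a file modified 50 hours ago counts as 2.

```shell
#!/bin/sh
d=$(mktemp -d)
touch -d '30 hours ago' "$d/age-30h"
touch -d '50 hours ago' "$d/age-50h"

plus_one=$(find "$d" -type f -mtime +1)     # more than 1 full day: age-50h
exactly_one=$(find "$d" -type f -mtime 1)   # exactly 1 full day:   age-30h
minus_one=$(find "$d" -type f -mtime -1)    # less than 1 full day: nothing

echo "+1: $plus_one"
echo " 1: $exactly_one"
echo "-1: $minus_one"
rm -rf "$d"
```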

Fix the fatal error for include paths and missing included files when using include or require in a cron job

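When running PHP from cron, relative include and require paths may not work.

E.g. when your file has the following require or include statement:

require '../includes/common.php';

You will get a fatal error when running it with php-cli from another directory, or with php-cli in cron, because a path starting with ../ is resolved against the current working directory, not the directory of the script: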

# php-cli
php /home/username123/public_html/cron/mycronjob.php
# cron
* * * * * php /home/username123/public_html/cron/mycronjob.php

The error:

Fatal error: require(): Failed opening required '../includes/common.php'
 (include_path='.:/usr/lib/php:/usr/local/lib/php') in
 /home/username123/public_html/cron/mycronjob.php on line 2

To fix this, call set_include_path in PHP so that, when cron runs the script, it knows where to look for the included or required files.

set_include_path

reference: http://php.net/manual/en/function.set-include-path.php

example:

set_include_path('/home/username123/public_html/includes/');
require 'common.php';
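
The underlying cause is not specific to PHP: any path beginning with ../ depends on the directory the command is started from. This is a minimal sketch in shell, using cat as a stand-in for require and a temporary directory tree with made-up names.

```shell
#!/bin/sh
base=$(mktemp -d)
mkdir -p "$base/cron" "$base/includes"
echo "common loaded" > "$base/includes/common.txt"

# Started from an unrelated directory, the relative path does not resolve.
from_root=$(cd / && cat ../includes/common.txt 2>/dev/null || echo "not found")

# Started from the script's own directory, it does.
from_cron=$(cd "$base/cron" && cat ../includes/common.txt)

echo "from /:    $from_root"
echo "from cron: $from_cron"
rm -rf "$base"
```

For the same reason, an alternative to changing the include path is to have the cron entry change directory first, for example cd /home/username123/public_html/cron && php mycronjob.php.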

My experience of switching PHP cron jobs from wget to php-cli (/usr/bin/php)

Tool used to convert the cron interval to human-readable time: https://crontab.guru/

This is one of my real-world experiences.

Lately we migrated some websites along with the entire WHM setup, keeping all the cron jobs running as they were.

There were about 12 cron jobs running, something like below:

59 23 6 /usr/bin/wget -O /dev/null "http://www.somedomain.net.au/some-api-data.php" >/dev/null 2>&1

However, because of the domain and IP routing changes, we could no longer run wget against any of the PHP files under these domains.

So my alternative was to move these scripts from running via wget to running internally via php-cli.

Below is the approach:

First, create a test cron job.

SSH in as root, add a small PHP script that outputs the timestamp, and run it as a cron job every minute as below (crontab -l will show this line at the bottom):

* * * * * /usr/bin/php /home/website/public_html/date_cron_test.php >> /usr/logs/date_cron_test.log 2> /dev/null

This successfully generated the log file, with a correct record of the output of each run.

Then change each cron job from

59 23 6 /usr/bin/wget -O /dev/null "http://www.somedomain.net.au/some-api-data.php" >/dev/null 2>&1

to

59 23 6 /usr/bin/php /home/somedomain/public_html/some-api-data.php >> /usr/logs/some-api-data.log 2> /dev/null

Change all the old wget entries to the /usr/bin/php (php-cli) form accordingly.

The PHP file must be readable by the user the cron job runs as. Execute permission (e.g. 755) is only needed if the script is invoked directly rather than through /usr/bin/php; this depends on your actual server settings.

Finally, run service crond reload.

It's all set up. All we need to do is check the cron and PHP logs later to verify.
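
The redirection used in the new cron lines appends standard output to the log and discards standard error. This is a minimal sketch of that behaviour, with a shell function named job standing in for the PHP script; the function and its messages are hypothetical.

```shell
#!/bin/sh
log=$(mktemp)

# Stand-in for the PHP script: one line on stdout, one on stderr.
job() {
    echo "run at $(date '+%Y-%m-%d %H:%M:%S')"
    echo "some warning" >&2
}

# Same redirection as the cron entries: append stdout, drop stderr.
job >> "$log" 2> /dev/null
job >> "$log" 2> /dev/null

runs=$(grep -c 'run at' "$log")
warnings=$(grep -c 'warning' "$log" || true)
echo "logged runs: $runs, logged warnings: $warnings"
rm -f "$log"
```

Because stderr goes to /dev/null, PHP errors printed there will not appear in the log. Using 2>&1 instead of 2> /dev/null would keep them in the same file.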

 

References:

  • Troubleshooting cron jobs that execute PHP: http://stackoverflow.com/questions/7397469/why-is-crontab-not-executing-my-php-script
  • crontab php file and output to log file result (append mode): http://stackoverflow.com/questions/9456424/crontab-php-file-and-output-to-log-file-result
  • Meanings of 2>&-, 2>/dev/null, |&, &>/dev/null and >/dev/null 2>&1: http://unix.stackexchange.com/questions/70963/difference-between-2-2-dev-null-dev-null-and-dev-null-21
  • cron "BAD FILE MODE": https://www.redhat.com/archives/rhl-list/2005-February/msg02458.html

Manually edit vhost file on WHM-Cpanel server

If you need to manually edit the vhost file on a server that is running WHM and cPanel, you need to edit the include files which are used to build the vhost file. The include files for each account are held in the following location:

/var/cpanel/userdata/*accountname*

The main file would be

/var/cpanel/userdata/*accountname*/yourdomain.com

Parked domains and subdomains are held in the "main" file in the same folder. For an account named "yourdomain" with the URL yourdomain.com, you would edit:

/var/cpanel/userdata/yourdomain/yourdomain.com

After editing the file you need to run the following three commands to rebuild the vhost file.

/usr/local/cpanel/bin/apache_conf_distiller --update
/scripts/rebuildhttpdconf
/etc/init.d/httpd restart

Linux find command: find all symlinks in a folder

find . -type l -ls

Here are a couple more find commands related to symbolic links:

Find symbolic links to a specific target

find . -lname link_target

Note that link_target may contain wildcard characters.

Find broken symbolic links

find -L . -type l -ls

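The -L option instructs find to follow symbolic links. A link that resolves is then reported as the type of its target, so -type l matches only the links that cannot be followed, i.e. the broken ones.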
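
The difference between the two commands can be checked on a throwaway directory. This is a minimal sketch, assuming GNU find; the link names good and broken are made up.

```shell
#!/bin/sh
d=$(mktemp -d)
touch "$d/target"
ln -s "$d/target" "$d/good"       # resolves to a regular file
ln -s "$d/missing" "$d/broken"    # points at nothing

all_links=$(find "$d" -type l | sort)   # both links
broken_links=$(find -L "$d" -type l)    # only the broken one

echo "all links:"
echo "$all_links"
echo "broken links:"
echo "$broken_links"
rm -rf "$d"
```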

Find & replace broken symbolic links

find -L . -type l -delete -exec ln -s new_target {} \;

More find examples

More find examples can be found here: http://hamwaves.com/find/

reference: http://stackoverflow.com/questions/8513133/how-do-i-find-all-of-the-symlinks-in-a-directory-tree

Add Multiple shared IPs in WHM manually

You can't add a shared root IP from within WHM, but you can add it manually via SSH.

Step One: Create the /var/cpanel/mainips/ directory if it doesn't exist:

# mkdir -p /var/cpanel/mainips/

Step Two: Create a /var/cpanel/mainips/root file listing all the IPs, as follows:

10.0.0.10
10.0.0.12

Each line is added as an additional shared IP in WHM.

Verify from WHM >> Home >> IP Functions >> Show/Edit Reserved IPs

 

source: https://syslint.com/blog/tutorial/how-to-configure-multiple-shared-ips-in-whm/

 

Find out which path phpmyadmin is installed on linux web server

You need to look in the configuration file to see where it is set up.

Something like this will find it

find /etc/httpd/ -print0 | xargs -0 grep phpmyadmin

Which will return something like this

/etc/httpd/conf.d/http.conf:    Alias /phpmyadmin /usr/local/share/phpmyadmin

Or look for the folder itself

locate phpmyadmin

or

find / -type d -name "phpmyadmin" -print
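
The configuration search can be tried safely on a copy of the layout. This is a minimal sketch that builds a temporary directory mimicking /etc/httpd/ with the Alias line from the sample output above. It adds -type f so that directories are not passed to grep, -i for a case-insensitive match, and -H so the file name is always printed; these options are my additions, not part of the original command.

```shell
#!/bin/sh
conf=$(mktemp -d)
mkdir -p "$conf/conf.d"
printf 'Alias /phpmyadmin /usr/local/share/phpmyadmin\n' > "$conf/conf.d/http.conf"
printf 'ServerName example.test\n' > "$conf/conf.d/other.conf"

# Search regular files only, ignore case, always show the file name.
hit=$(find "$conf" -type f -print0 | xargs -0 grep -iH 'phpmyadmin')
echo "$hit"
rm -rf "$conf"
```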

 

source: http://serverfault.com/questions/149541/where-is-phpmyadmin-installed-on-linux-web-server/573004

Get sizes of all databases in phpMyAdmin

If you are logged in as the root account, you'll be able to see the default information_schema database; the size data is held in its "TABLES" table.

To get more readable stats, you can run the following query, which outputs a formatted result:

SELECT table_schema "Data Base Name",
sum( data_length + index_length ) / 1024 / 1024 "Data Base Size in MB"
FROM information_schema.TABLES GROUP BY table_schema;

source:

http://www.uponmyshoulder.com/blog/2010/get-database-size-in-phpmyadmin/

Use the find command to locate files by modification date

Find files modified in a given date range

find . -type f -newermt "2016-12-10 00:00:00" ! -newermt "2017-02-25 18:50:00"

Find files modified in the past 75 days

find . -mtime -75
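
The date-range form can be verified on files with known timestamps. This is a minimal sketch, assuming GNU touch and GNU find, with made-up file names: one file before the range, one inside it and one after it.

```shell
#!/bin/sh
d=$(mktemp -d)
touch -d '2016-11-01 12:00:00' "$d/before"
touch -d '2017-01-15 12:00:00' "$d/inside"
touch -d '2017-03-01 12:00:00' "$d/after"

# Newer than the start date, but not newer than the end date.
in_range=$(find "$d" -type f -newermt '2016-12-10 00:00:00' ! -newermt '2017-02-25 18:50:00')
echo "$in_range"
rm -rf "$d"
```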

references:

http://unix.stackexchange.com/questions/29245/how-to-list-files-that-were-changed-in-a-certain-range-of-time

http://stackoverflow.com/questions/848293/shell-script-get-all-files-modified-after-date

http://askubuntu.com/questions/191044/how-to-find-files-between-two-dates-using-find

Linux file sync: rsync usage, including syncing from a remote server

Description

This article lists examples of using rsync to copy and synchronise files on Linux.

Local to Remote

File from local to remote.

rsync /some/local/directory/foobar.txt your_username@remotehost.edu:/some/remote/directory

All recursive contents in local directory to remote directory.

rsync --recursive /some/local/directory/ your_username@remotehost.edu:/some/remote/directory

All recursive contents in local directory to remote directory. Only copy if newer or missing. Delete remote files that do not exist locally. Keep owner, group and permissions the same. Keep executable flag the same.

rsync --verbose --recursive --update --archive --delete --executability /some/local/directory/ your_username@remotehost.edu:/some/remote/directory

Local to Local

All recursive content in source local directory to target local directory, for example an attached USB storage device.

rsync -va /some/local/directory /some/other/local/directory
rsync -va /Users/jdoe /Volumes/usbdrive/Users/
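
Whether the source path ends in a slash changes what lands in the target. This is a minimal sketch on temporary directories, assuming rsync is installed; the directory names with_slash and without_slash are made up.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/src" "$work/with_slash" "$work/without_slash"
echo hello > "$work/src/file.txt"

# Trailing slash: copy the contents of src into the destination.
rsync -a "$work/src/" "$work/with_slash/"

# No trailing slash: copy the directory src itself into the destination.
rsync -a "$work/src" "$work/without_slash/"

ls "$work/with_slash"       # file.txt
ls "$work/without_slash"    # src
```

The two commands above (without a trailing slash on the source) therefore create a directory or jdoe folder inside the target rather than merging the contents into it.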

Manual

NAME
       rsync — a fast, versatile, remote (and local) file-copying tool

SYNOPSIS
       Local:  rsync [OPTION...] SRC... [DEST]

       Access via remote shell:
         Pull: rsync [OPTION...] [USER@]HOST:SRC... [DEST]
         Push: rsync [OPTION...] SRC... [USER@]HOST:DEST

       Access via rsync daemon:
         Pull: rsync [OPTION...] [USER@]HOST::SRC... [DEST]
               rsync [OPTION...] rsync://[USER@]HOST[:PORT]/SRC... [DEST]
         Push: rsync [OPTION...] SRC... [USER@]HOST::DEST
               rsync [OPTION...] SRC... rsync://[USER@]HOST[:PORT]/DEST

       Usages with just one SRC arg and no DEST arg will list the source files instead of copying.

OPTIONS SUMMARY
       Here  is  a short summary of the options available in rsync. Please refer to the detailed description below for a
       complete description.

        -v, --verbose               increase verbosity
        -q, --quiet                 suppress non-error messages
            --no-motd               suppress daemon-mode MOTD (see caveat)
        -c, --checksum              skip based on checksum, not mod-time & size
        -a, --archive               archive mode; equals -rlptgoD (no -H,-A,-X)
            --no-OPTION             turn off an implied OPTION (e.g. --no-D)
        -r, --recursive             recurse into directories
        -R, --relative              use relative path names
            --no-implied-dirs       don’t send implied dirs with --relative
        -b, --backup                make backups (see --suffix & --backup-dir)
            --backup-dir=DIR        make backups into hierarchy based in DIR
            --suffix=SUFFIX         backup suffix (default ~ w/o --backup-dir)
        -u, --update                skip files that are newer on the receiver
            --inplace               update destination files in-place
            --append                append data onto shorter files
            --append-verify         --append w/old data in file checksum
        -d, --dirs                  transfer directories without recursing using
                                    "-r --exclude=’/*/*’" (rsync 2.6.x compatible)
            --new-dirs              transfer directories without recursing
                                    (rsync 3.0.x compatible)
        -l, --links                 copy symlinks as symlinks
        -L, --copy-links            transform symlink into referent file/dir
            --copy-unsafe-links     only "unsafe" symlinks are transformed
            --safe-links            ignore symlinks that point outside the tree
        -k, --copy-dirlinks         transform symlink to dir into referent dir
        -K, --keep-dirlinks         treat symlinked dir on receiver as dir
        -H, --hard-links            preserve hard links
        -p, --perms                 preserve permissions
        -E, --executability         preserve executability
            --chmod=CHMOD           affect file and/or directory permissions
        -A, --acls                  preserve ACLs (implies -p)
        -X, --xattrs                preserve extended attributes
        -o, --owner                 preserve owner (super-user only)
        -g, --group                 preserve group
            --devices               preserve device files (super-user only)
            --specials              preserve special files
        -D                          same as --devices --specials
        -t, --times                 preserve modification times
        -O, --omit-dir-times        omit directories from --times
            --super                 receiver attempts super-user activities
            --fake-super            store/recover privileged attrs using xattrs
        -S, --sparse                handle sparse files efficiently
        -n, --dry-run               perform a trial run with no changes made
        -W, --whole-file            copy files whole (w/o delta-xfer algorithm)
        -x, --one-file-system       don’t cross filesystem boundaries
        -B, --block-size=SIZE       force a fixed checksum block-size
        -e, --rsh=COMMAND           specify the remote shell to use
            --rsync-path=PROGRAM    specify the rsync to run on remote machine
            --existing              skip creating new files on receiver
            --ignore-existing       skip updating files that exist on receiver
            --remove-source-files   sender removes synchronized files (non-dir)
            --del                   an alias for --delete-during
            --delete                delete extraneous files from dest dirs
            --delete-before         receiver deletes before transfer (default)
            --delete-during         receiver deletes during xfer, not before
            --delete-delay          find deletions during, delete after
            --delete-after          receiver deletes after transfer, not before
            --delete-excluded       also delete excluded files from dest dirs
            --ignore-errors         delete even if there are I/O errors
            --force                 force deletion of dirs even if not empty
            --max-delete=NUM        don’t delete more than NUM files
            --max-size=SIZE         don’t transfer any file larger than SIZE
            --min-size=SIZE         don’t transfer any file smaller than SIZE
            --partial               keep partially transferred files
            --partial-dir=DIR       put a partially transferred file into DIR
            --delay-updates         put all updated files into place at end
        -m, --prune-empty-dirs      prune empty directory chains from file-list
            --numeric-ids           don’t map uid/gid values by user/group name
            --timeout=SECONDS       set I/O timeout in seconds
            --contimeout=SECONDS    set daemon connection timeout in seconds
        -I, --ignore-times          don’t skip files that match size and time
            --size-only             skip files that match in size
            --modify-window=NUM     compare mod-times with reduced accuracy
        -T, --temp-dir=DIR          create temporary files in directory DIR
        -y, --fuzzy                 find similar file for basis if no dest file
            --compare-dest=DIR      also compare received files relative to DIR
            --copy-dest=DIR         ... and include copies of unchanged files
            --link-dest=DIR         hardlink to files in DIR when unchanged
        -z, --compress              compress file data during the transfer
            --compress-level=NUM    explicitly set compression level
            --skip-compress=LIST    skip compressing files with suffix in LIST
        -C, --cvs-exclude           auto-ignore files in the same way CVS does
        -f, --filter=RULE           add a file-filtering RULE
        -F                          same as --filter=’dir-merge /.rsync-filter’
                                    repeated: --filter=’- .rsync-filter’
            --exclude=PATTERN       exclude files matching PATTERN
            --exclude-from=FILE     read exclude patterns from FILE
            --include=PATTERN       don’t exclude files matching PATTERN
            --include-from=FILE     read include patterns from FILE
            --files-from=FILE       read list of source-file names from FILE
        -0, --from0                 all *from/filter files are delimited by 0s
        -s, --protect-args          no space-splitting; wildcard chars only
            --address=ADDRESS       bind address for outgoing socket to daemon
            --port=PORT             specify double-colon alternate port number
            --sockopts=OPTIONS      specify custom TCP options
            --blocking-io           use blocking I/O for the remote shell
            --stats                 give some file-transfer stats
        -8, --8-bit-output          leave high-bit chars unescaped in output
        -h, --human-readable        output numbers in a human-readable format
            --progress              show progress during transfer
        -P                          same as --partial --progress
        -i, --itemize-changes       output a change-summary for all updates
            --out-format=FORMAT     output updates using the specified FORMAT
            --log-file=FILE         log what we’re doing to the specified FILE
            --log-file-format=FMT   log updates using the specified FMT
            --password-file=FILE    read daemon-access password from FILE
            --list-only             list the files instead of copying them
            --bwlimit=KBPS          limit I/O bandwidth; KBytes per second
            --write-batch=FILE      write a batched update to FILE
            --only-write-batch=FILE like --write-batch but w/o updating dest
            --read-batch=FILE       read a batched update from FILE
            --protocol=NUM          force an older protocol version to be used
            --iconv=CONVERT_SPEC    request charset conversion of filenames
            --checksum-seed=NUM     set block/file checksum seed (advanced)
        -4, --ipv4                  prefer IPv4
        -6, --ipv6                  prefer IPv6
            --version               print version number
       (-h) --help                  show this help (see below for -h comment)

       Rsync can also be run as a daemon, in which case the following options are accepted:

            --daemon                run as an rsync daemon
            --address=ADDRESS       bind to the specified address
            --bwlimit=KBPS          limit I/O bandwidth; KBytes per second
            --config=FILE           specify alternate rsyncd.conf file
            --no-detach             do not detach from the parent
            --port=PORT             listen on alternate port number
            --log-file=FILE         override the "log file" setting
            --log-file-format=FMT   override the "log format" setting
            --sockopts=OPTIONS      specify custom TCP options
        -v, --verbose               increase verbosity
        -4, --ipv4                  prefer IPv4
        -6, --ipv6                  prefer IPv6
        -h, --help                  show this help (if used after --daemon)

Source: https://ss64.com/bash/rsync.html

Examples:

Exclude directory

$ rsync -avz --exclude 'dir1' source/ destination/

Exclude directories matching a pattern

$ rsync -avz --exclude 'dir*' source/ destination/

Exclude a specified file

$ rsync -avz --exclude 'dir1/dir2/file3.txt' source/ destination/

Exclude paths are relative to the source directory (the transfer root), not the filesystem root, so in this example the following two commands have the same effect:

$ rsync -avz --exclude '/dir1/dir2/file3.txt' source/ destination/
$ rsync -avz --exclude 'dir1/dir2/file3.txt' source/ destination/

Exclude a specific file type

$ rsync -avz --exclude '*.txt' source/ destination/

Exclude multiple files and directories at the same time

$ rsync -avz --exclude file1.txt --exclude dir3/file4.txt source/ destination/

To sync only files within a date range, first run the find command to generate a file list:

$ find /home/ -type f -newermt "2016-12-10 00:00:00" ! -newermt "2017-02-23 18:50:00" > filetosync.txt

You'll need to change the absolute paths in this file to paths relative to the source directory (here /home/), then reference the file in the rsync command. The -vv option increases verbosity.

$ sudo rsync -arogzlvv --files-from=filetosync.txt -e 'ssh -p 3636' root@domain.com:/home/ /home/ > rsync.liverun.log
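
Converting the list from absolute to relative paths can be done with sed. This is a minimal sketch, assuming the source directory is /home/ as in the command above; the two sample paths are made up.

```shell
#!/bin/sh
list=$(mktemp)
printf '%s\n' /home/alice/site/index.php /home/bob/logs/app.log > "$list"

# Strip the leading /home/ so each path is relative to the rsync source.
relative=$(sed 's|^/home/||' "$list")
echo "$relative"
rm -f "$list"
```

On the real list, sed -i 's|^/home/||' filetosync.txt would rewrite the file in place (GNU sed).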

Further reading:

The rsync algorithm

https://rsync.samba.org/tech_report/

Extra: reading the output when the -i option is added

Let's take a look at how rsync works and better understand the cryptic result lines:

1 - A big advantage of rsync is that after an interruption, the next run continues smoothly.

The next rsync invocation will not transfer again the files it had already transferred, provided they were not changed in the meantime. It will, however, start checking all the files again from the beginning, as it is not aware that it had been interrupted.

2 - Each character is a code that can be translated by reading the section for -i, --itemize-changes in man rsync.

Decoding some example log lines:

>f.st......

> - the item is received
f - it is a regular file
s - the file size is different
t - the time stamp is different

.d..t......

. - the item is not being updated (though it might have attributes 
    that are being modified)
d - it is a directory
t - the time stamp is different

>f+++++++++

> - the item is received
f - a regular file
+++++++++ - this is a newly created item

The relevant part of the rsync man page:

-i, --itemize-changes

Requests a simple itemized list of the changes that are being made to each file, including attribute changes. This is exactly the same as specifying --out-format='%i %n%L'. If you repeat the option, unchanged files will also be output, but only if the receiving rsync is at least version 2.6.7 (you can use -vv with older versions of rsync, but that also turns on the output of other verbose messages).

The "%i" escape has a cryptic output that is 11 letters long. The general format is like the string YXcstpoguax, where Y is replaced by the type of update being done, X is replaced by the file-type, and the other letters represent attributes that may be output if they are being modified.

The update types that replace the Y are as follows:

  • A < means that a file is being transferred to the remote host (sent).
  • A > means that a file is being transferred to the local host (received).
  • A c means that a local change/creation is occurring for the item (such as the creation of a directory or the changing of a symlink, etc.).
  • A h means that the item is a hard link to another item (requires --hard-links).
  • A . means that the item is not being updated (though it might have attributes that are being modified).
  • A * means that the rest of the itemized-output area contains a message (e.g. "deleting").

The file-types that replace the X are: f for a file, a d for a directory, an L for a symlink, a D for a device, and a S for a special file (e.g. named sockets and fifos).

The other letters in the string above are the actual letters that will be output if the associated attribute for the item is being updated or a "." for no change. Three exceptions to this are: (1) a newly created item replaces each letter with a "+", (2) an identical item replaces the dots with spaces, and (3) an unknown attribute replaces each letter with a "?" (this can happen when talking to an older rsync).

The attribute that is associated with each letter is as follows:

  • A c means either that a regular file has a different checksum (requires --checksum) or that a symlink, device, or special file has a changed value. Note that if you are sending files to an rsync prior to 3.0.1, this change flag will be present only for checksum-differing regular files.
  • A s means the size of a regular file is different and will be updated by the file transfer.
  • A t means the modification time is different and is being updated to the sender’s value (requires --times). An alternate value of T means that the modification time will be set to the transfer time, which happens when a file/symlink/device is updated without --times and when a symlink is changed and the receiver can’t set its time. (Note: when using an rsync 3.0.0 client, you might see the s flag combined with t instead of the proper T flag for this time-setting failure.)
  • A p means the permissions are different and are being updated to the sender’s value (requires --perms).
  • An o means the owner is different and is being updated to the sender’s value (requires --owner and super-user privileges).
  • A g means the group is different and is being updated to the sender’s value (requires --group and the authority to set the group).
  • The u slot is reserved for future use.
  • The a means that the ACL information changed.
  • The x means that the extended attribute information changed.

One other output is possible: when deleting files, the "%i" will output the string "*deleting" for each item that is being removed (assuming that you are talking to a recent enough rsync that it logs deletions instead of outputting them as a verbose message).
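
The first two characters of an itemized string are the quickest to decode mechanically. This is a minimal sketch covering only the update type, the file type and the "newly created" case described above; the function name describe_itemized is hypothetical, and the attribute letters are not decoded.

```shell
#!/bin/sh
# Hypothetical helper: describe the Y (update type) and X (file type)
# positions of an rsync %i string. Runs in a subshell to keep variables local.
describe_itemized() (
    code="$1"
    update=$(printf '%s' "$code" | cut -c1)
    ftype=$(printf '%s' "$code" | cut -c2)
    attrs=$(printf '%s' "$code" | cut -c3-)

    # "*" means the rest of the string is a message such as "deleting".
    if [ "$update" = "*" ]; then
        echo "message: $(printf '%s' "$code" | cut -c2-)"
        exit 0
    fi

    case "$update" in
        "<") u="sent" ;;
        ">") u="received" ;;
        c)   u="local change/creation" ;;
        h)   u="hard link" ;;
        .)   u="not updated" ;;
        *)   u="unknown update type" ;;
    esac
    case "$ftype" in
        f) t="file" ;;
        d) t="directory" ;;
        L) t="symlink" ;;
        D) t="device" ;;
        S) t="special file" ;;
        *) t="unknown file type" ;;
    esac
    case "$attrs" in
        +++++++++) echo "$u $t, newly created" ;;
        *)         echo "$u $t" ;;
    esac
)

describe_itemized '>f.st......'    # received file
describe_itemized '.d..t......'    # not updated directory
describe_itemized '>f+++++++++'    # received file, newly created
```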

Source: http://stackoverflow.com/questions/4493525/rsync-what-means-the-f-on-rsync-logs

Delete all Duplicate Rows except for One record in MySQL

NB - You need to do this first on a test copy of your table!

When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.

1) If you want to keep the row with the lowest id value:

DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name

2) If you want to keep the row with the highest id value:

DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name

I used this method in MySQL 5.1

Not sure about other versions.

Update: although the question is about DELETE, be advised that using INSERT with DISTINCT is much faster. For a database with 8 million rows, the query below took 13 minutes, while the DELETE ran for more than 2 hours without completing.

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
 SELECT DISTINCT cellId,attributeId,entityRowId,value
 FROM tableName;

Reference:

http://stackoverflow.com/questions/4685173/delete-all-duplicate-rows-except-for-one-in-mysql

Additional information

DELETE 
 t1 
FROM 
 tTable t1, tTable t2 
WHERE 
 t1.fieldName = t2.fieldName AND t1.id > t2.id

Alternatively,

create another table as below

CREATE TABLE myTable_new (ID INT PRIMARY KEY, Title varchar(20))

and add one row per distinct Title, keeping the lowest ID, as

INSERT INTO myTable_new (ID, Title) SELECT MIN(ID), Title FROM old_table GROUP BY Title

where old_table is the original table.

source: http://stackoverflow.com/questions/9452320/remove-duplicate-records-except-one-record

Bash shell script to list the cron jobs of all cPanel users

vi /root/crondetails.sh

#!/bin/bash
# Collect every user's crontab from /var/spool/cron into one report file.

report=/crondetails

cd /var/spool/cron || exit 1

for user in *
do
    [ -f "$user" ] || continue   # skip anything that is not a crontab file

    {
        echo "#########################CRON DETAILS FOR USER $user "
        echo ""
        cat "$user"
        echo ""
        echo "####################################################"
        echo ""
    } >> "$report"
done