philtortoise System/DBA+: 2014

Monday, June 30, 2014

How long to complete this command ?

Running some heavy command line operations, you have usually no idea when it will complete...

Actually you can probably find out!... if you're on Linux and it is a file-related operations.

1) Find out how much is done

For example if I am running this restore:

$ zcat dump_MYDB_20140501_08h00.sql.gz | mysql MYDB

Under Linux I can actually find out how much of the dump file has been read, and therefore guess the speed.

Using 'top' or 'ps' I can easily find the PID of the process doing the read, here it is "zcat"

PID USER    PR NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
20425 phil    26   0 4172 516 348 R 2.0 0.0   0:15.18 zcat
Then from /proc/<PID>/fd   one can read the current status of its file descriptors

$ ls -l /proc/20425/fd

total 0lrwx------ 1 phil g 64 May 7 14:17 0 -> /dev/pts/5
l-wx------ 1 phil g 64 May 7 14:17 1 -> pipe:[69451820]
lrwx------ 1 phil g 64 May 7 14:05 2 -> /dev/pts/5
lr-x------ 1 phil g 64 May 7 14:17 3 -> /var/dump_MYDB_20140501_08h00.sql.gz

I see that the file descriptor 3 is used to read the file. With recent Linux kernels I can also check the current position of each file descriptor, in the /proc/<PID>/fdinfo directory:

$ cat /proc/20425/fdinfo/3
pos:    353861632
flags: 0100000

I just have to compare to the full size of my dump file:

$ ls -l /var/lib/mysql/backups/dump_MYDB_20140501_08h00.sql.gz
-rw-r--r-- 1 phil g 686830066 May 6 13:00 /var/dump_MYDB_20140501_08h00.sql.gz

So at this point the process has read 353 861 632 bytes out of 686 830 066 , so we're roughly 50% done!

2) Compute an approximate speed, and finishing time

It is now easy to do several readings and approximate the speed:

$ cat fdinfo/3 ; sleep 10 ; cat fdinfo/3
pos: 372080640
flags: 0100000
pos: 447578112
flags: 0100000

$ echo $(( (447578112-372080640)/10 ))
7549747

I'm processing the file at 7MB/s (Yes pathetic I know... I'd love to investigate this damn SAN!)

(Of course it is an approximation because the processing speed may not be linear. In this very example the speed depends on the nature of the tables and indexes. Perhaps the end of the dump contain more complex ones.)

Since I still have to read the remainder of the file at this approximate speed, I can compute how long it would take.

$ echo $(( (686830066-447578112)/7549747 ))
31

...Well I shall be done in about 31 seconds!

OK back to work! Cheers!

Wednesday, May 7, 2014

trick: make several directories in one shot

Today I learned that you can nicely create several directories this way:

mkdir -p ~/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}

$ find ~/rpmbuild
/home/pweltz/rpmbuild
/home/pweltz/rpmbuild/RPMS
/home/pweltz/rpmbuild/SRPMS
/home/pweltz/rpmbuild/BUILD
/home/pweltz/rpmbuild/SOURCES
/home/pweltz/rpmbuild/SPECS

This is nice.

It's now almost 20 years I started on Unix/Linux and I could have used this... You learn every day :-)

Friday, January 10, 2014

Mysql: (re)create a slave online faster using LVM snapshot and Netcat

Summary:

Flush tables & lock , create snapshot for datas AND logfiles

Prerequisites:

-create or replace /etc/my.cnf on slave machine
-make sure innodb_log_file_size is same on both sides

Flush, lock and create snapshot:

> FLUSH TABLES WITH READ LOCK;

> show master status;

+----------------------------+-----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------------------+-----------+--------------+------------------+
| mysql-toto50-bin.003912 | 653187465 | | |
+----------------------------+-----------+--------------+------------------+
(write this down)

> system sudo /usr/sbin/lvcreate -L 100G -n snap1 -s /dev/mysql_vg2/mysql_vg2-mysql_lv
(/dev/mysql_vg2/mysql_vg2-mysql_lv is the volume where I keep mysql data AND log )

> UNLOCK TABLES;
> quit;

Copy the snapshot via NETCAT

Netcat is much faster than FTP, SCP, SFTP, Samba, NFS...
Exception: If partial copy : use rsync
It has to be on a trusted network, but so shall be the link between slave and master!

# mount /dev/mysql_vg2/snap1 /var/mnt

SLAVE:
# mkdir /var/lib/mysql/new_data
# cd /var/lib/mysql/new_data && nc -l 64738 | tar -xvf -

(When I have enough room, I usually keep the old slave directory just in case)
(alternative: rsync -arv ... )

MASTER:
# cd /var/mnt && nice tar -cf - data | nice nc toto51 64738

Release the snapshot

I like to check the percentage used before:
# /usr/sbin/lvs
LV                 VG         Attr   LSize   Origin             Snap% Move
...
mysql_vg2-mysql_lv mysql_vg2 owi-ao   1.00T
snap1              mysql_vg2 swi-a- 150.00G mysql_vg2-mysql_lv   1.22

# umount /var/mnt
# /usr/sbin/lvremove /dev/mysql_vg2/snap1

Start the mysql slave instance:

#make sure the slave wont start yet
set: SKIP SLAVE START in my.cf

#move files in proper place (now I delete the old version)

#check permissions (chown, chmod, SElinux chattr... )

#start slave instance

service mysql start
or /etc/init.d/mysql start

#Define the master

> CHANGE MASTER TO MASTER_HOST='toto50',MASTER_USER='replicator', MASTER_PASSWORD='****', MASTER_LOG_FILE='mysql-toto50-bin.003912', MASTER_LOG_POS=653187465;

#start slave process

start slave

show slave status\G;

#do a test insert on master to check it is replicated...

#remove SKIP SLAVE START in case there is a reboot

Remarks:

With this method, the slave instance starts with a crash recovery
I sometimes notice this in the log:

140110 9:03:30 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
140110 9:03:31 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Last MySQL binlog file position 0 653173962, file name /var/log/mysql/mysql-toto50-bin.003912
InnoDB: We removed now the InnoDB internal data dictionary entry
InnoDB: of table 140110 9:03:52 [Warning] Invalid (old?) table or database name '#sql1308_3a493_267e'
"tmp".<result 2 when explaining filename '#sql1308_3a493_267e'>.
InnoDB: We removed now the InnoDB internal data dictionary entry
InnoDB: of table 140110 9:03:52 [Warning] Invalid (old?) table or database name '#sql1308_3dbbf_1fb'
"tmp".<result 2 when explaining filename '#sql1308_3dbbf_1fb'>
....

These #sql... are temp tables so it's no big deal

Extra trick:

When I do it too quickly on demo systems, I sometimes forgot to write down the result of 'SHOW MASTER STATUS'
Never mind, on the mounted snapshot volume I can simply check which one was the latest binlog:

$ ls -ltr | tail
..
-rw-rw---- 1 mysql      mysql 1161283517 Oct 31 06:08 mysql-toto50-bin.003288
-rw-rw---- 1 mysql      mysql 261972409 Oct 31 06:59 mysql-toto50-bin.003289    <--HERE
-rw-r-x--- 1 mysql      mysql 268435456 Oct 31 06:59 ib_logfile1
-rw-r-x--- 1 mysql      mysql 268435456 Oct 31 06:59 ib_logfile0

And inside this binlog I can look up for the latest entry:

$ mysqlbinlog mysql-toto50-bin.003289 | tail
eGtTVBcyAAAAbgAAAJ5hnQ8AAKc4BwAAAAEABP/wCmxvY2tlZEZsb3cRaGVhcnRiZWF0LXdhdGNo
ZXIdAGNscHJpd3M1MC5wcml2LmRlc2phcmRpbnMuZGV2EzIwMTQtMTAtMzEgMDY6NTk6MDQ=
'/*!*/;
# at 261972382
#141031 6:59:04 server id 50 end_log_pos 261972409    Xid = 2164008837
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;