Friday, March 20, 2015

MySql installation using RPM packages

(Source: http://dev.mysql.com/doc/refman/5.6/en/linux-installation-rpm.html)

To download MySQL community server, follow http://dev.mysql.com/downloads/mysql/ -> GA release -> linux-generic ->
MySQL-server-5.6.23-1.linux_glibc2.5.x86_64.rpm

To download MySQL community client, follow http://dev.mysql.com/downloads/mysql/ -> GA release -> linux-generic ->
MySQL-client-5.6.23-1.linux_glibc2.5.x86_64.rpm

yum localinstall Downloads/MySQL-server-5.6.23-1.linux_glibc2.5.x86_64.rpm

The above command internally executes all the below steps.

shell> groupadd mysql
shell> useradd -r -g mysql mysql
shell> cd /usr/local
shell> tar zxvf /path/to/mysql-VERSION-OS.tar.gz
shell> ln -s full-path-to-mysql-VERSION-OS mysql
shell> cd mysql
shell> chown -R mysql .
shell> chgrp -R mysql .
shell> scripts/mysql_install_db --user=mysql
shell> chown -R root .
shell> chown -R mysql data
shell> bin/mysqld_safe --user=mysql &
# Next command is optional
shell> cp support-files/mysql.server /etc/init.d/mysql.server

less /etc/groups#should display mysql group
which mysqld #should display /usr/sbin/mysqld
ls /etc/rc.d/init.d #should contain mysql file. This file is used to start MySQL as system service.
chkconfig --list#should display MySQL with at least one flag ON
ls /var/lib/mysql# this is used as the data directory. This should contain all the core databases initialized by /usr/bin/mysqld_safe script.

yum localinstall Downloads/MySQL-client-5.6.23-1.linux_glibc2.5.x86_64.rpm

which mysql # should display /usr/bin/mysql

service mysql start # starts mysqld

mysql -uroot -ppassword # mysqld_safe stores password in file $home/.mysql-secret. If file not found in home directory, try to find in / directory.

As of MySQL 5.6.8, new RPM install operations (not upgrades) invoke mysql_install_db with the --random-passwords option that provides for more secure MySQL installation. Invoking mysql_install_db with --random-passwords causes it to assign a random password to the MySQL root accounts, set the “password expired” flag for those accounts, and not create anonymous-user MySQL accounts. It will be necessary after installation to start the server, connect as root using the password written to the $HOME/.mysql_secret file, and assign a new root password. Until this is done, root cannot do anything else. This must be done for each root account you intend to use. To change the password, you can use the SET PASSWORD statement (for example, with the mysql client).

Mysql innodb Btree indexes

The Btree is simply a collection of one or more data pages, called nodes. The non-leaf nodes help in reaching the leaf nodes. Depending on the height of the tree, reaching leaf nodes might require traversing through multiple levels of non-leaf nodes.

In Innodb, the leaf nodes are nothing but index pages. The index pages contain index records.

For non-clustered (secondary) indexes, index record contains an index key value and a pointer to the data page containing the corresponding data record.

For clustered indexes (primary key) , the index pages are nothing but data pages.

(reference book: pro mysql, chapter 2 (index concepts))

Thursday, March 19, 2015

Why process forking?

If a a child process is forked from parent process, whenever child process crashes, linux will notify the parent process through events. For this parent process has to register a callback with linux, so that linux will call the callback with events.

When forking happens a child process is created which is an exact replica of parent process even with same register contents. Both processes continue to execute individually. For forking parent would have just executed fork() call. Upon execution parent will receive child process id as fork() return value. Child will get null as fork() return value.

Fork also has another advantage. We can share common data between both processes. Linux allows this.

Monday, March 16, 2015

Mysql InnoDB’s gap locks

InnoDB’s gap locks (http://www.mysqlperformanceblog.com/2012/03/27/innodbs-gap-locks/)

One of the most important features of InnoDB is the row level locking. This feature provides better concurrency under heavy write load but needs additional precautions to avoid phantom reads and to get a consistent Statement based replication. To accomplish that, row level locking databases also acquire gap locks.

What is a Phantom Read

A Phantom Read happens when in a running transaction, two identical statements get different values, because some other transaction has modified the table’s rows. For example:

transaction1> START TRANSACTION; transaction1> SELECT * FROM t WHERE i > 20 FOR UPDATE; +------+ | i | +------+ | 21 | | 25 | | 30 | +------+

transaction2> START TRANSACTION; transaction2> INSERT INTO t VALUES(26); transaction2> COMMIT;

transaction1> select * from t where i > 20 FOR UPDATE; +------+ | i | +------+ | 21 | | 25 | | 26 | | 30 | +------+

Phantom reads do not occur if you’re simply doing a SELECT. They only occur if you do UPDATE or DELETE or SELECT FOR UPDATE. InnoDB provides REPEATABLE READ for read-only SELECT, but it behaves as if you use READ COMMITTED for all write queries, in spite of your chosen transaction isolation level (considering only the two most common isolation levels, REPEATABLE READ and READ COMMITTED).

What is a gap lock?

A gap lock is a lock on the gap between index records. Thanks to this gap lock, when you run the same query twice, you get the same result, regardless other session modifications on that table. This makes reads consistent and therefore makes the replication between servers consistent. If you execute SELECT * FROM id > 1000 FOR UPDATE twice, you expect to get the same value twice. To accomplish that, InnoDB locks all index records found by the WHERE clause with an exclusive lock and the gaps between them with a shared gap lock.

This lock doesn’t only affect to SELECT … FOR UPDATE. This is an example with a DELETE statement:

transaction1 > SELECT * FROM t; +------+ | age | +------+ | 21 | | 25 | | 30 | +------+

Start a transaction and delete the record 25:

transaction1 > START TRANSACTION; transaction1 > DELETE FROM t WHERE age=25;

At this point we suppose that only the record 25 is locked. Then, we try to insert another value on the second session:

transaction2 > START TRANSACTION; transaction2 > INSERT INTO t VALUES(26); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction transaction2 > INSERT INTO t VALUES(29); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction transaction2 > INSERT INTO t VALUES(23); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction transaction2 > INSERT INTO t VALUES(31); Query OK, 1 row affected (0.00 sec)

After running the delete statement on the first session, not only the affected index record has been locked but also the gap before and after that record with a shared gap lock preventing the insertion of data to other sessions.

How to troubleshoot gap locks?

Is possible to detect those gap locks using SHOW ENGINE INNODB STATUS:

---TRANSACTION 72C, ACTIVE 755 sec 4 lock struct(s), heap size 1248, 3 row lock(s), undo log entries 1 MySQL thread id 3, OS thread handle 0x7f84a78ba700, query id 163 localhost msandbox TABLE LOCK table `test`.`t` trx id 72C lock mode IX RECORD LOCKS space id 19 page no 4 n bits 80 index `age` of table `test`.`t` trx id 72C lock_mode X RECORD LOCKS space id 19 page no 3 n bits 80 index `GEN_CLUST_INDEX` of table `test`.`t` trx id 72C lock_mode X locks rec but not gap RECORD LOCKS space id 19 page no 4 n bits 80 index `age` of table `test`.`t` trx id 72C lock_mode X locks gap before rec

If you have lot of gaps locks in your transactions affecting the concurrency and the performance you can disable them in two different ways:

1- Change the ISOLATION level to READ COMMITTED. In this isolation level, it is normal and expected that query results can change during a transaction, so there is no need to create locks to prevent that from happening.
2- innodb_locks_unsafe_for_binlog = 1. Disables the gap locks except for foreign-key constraint checking or duplicate-key checking.

The most important difference between these two options is that the second one is a global variable that affects all sessions and needs a server restart to change its value. Both options cause phantom reads (non repeatable reads) so in order to prevent problems with the replication you should change the binary log format to “row”.

Depending on the statement, the behavior of these locks can be different. In the following link there is a good source of information:

http://dev.mysql.com/doc/refman/5.1/en/innodb-locks-set.html.

Conclusion
MySQL uses REPEATABLE READ as the default isolation level so it needs to lock the index records and the gaps to avoid phantom reads and to get a consistent Statement based replication. If your application can deal with phantom reads and your binary log is in row format, changing the ISOLATION to READ COMMITTED will help you to avoid all those extra locks. As a final advice, keep your transactions short

Mysql Table locks in SHOW INNODB STATUS

Table locks in SHOW INNODB STATUS

Quite frequently I see people confused what table locks reported by SHOW INNODB STATUS really mean. Check this out for example:

---TRANSACTION 0 4872, ACTIVE 32 sec, process no 7142, OS thread id 1141287232
2 lock struct(s), heap size 368
MySQL thread id 8, query id 164 localhost root
TABLE LOCK table `test/t1` trx id 0 4872 lock mode IX

This output gives us an impression Innodb has taken table lock on test/t1 table and many people tend to think Innodb in fact in some circumstances would abandon its row level locking and use table locks instead. I’ve seen various theories ranging from lock escalation to usinghttp://www.mysqlperformanceblog.com/2012/03/27/innodbs-gap-locks/ table locks in special cases, for example when no indexes are defined on the table. None of this is right.

In fact Innodb uses Multiple Granularity Locking and there is always lock taken on the whole table before individual locks can be locked.
Such locks are called intention lock, hence abbreviation IX = Intention eXclusive. Intention locks do not work the same way as table locks – Intention exclusive lock does not prevent other threads taking intention shared or even intention exclusive locks on the same table.

What does Intention mean ? Just what it says. If Innodb sets intention exclusive lock on the table this means it plans to lock some of the rows in exclusive mode. What would these be used for ? They are used to be able to handle operation on the whole table – for example to drop the table you need to lock it exclusively.

So do not worry intention table locks you may observe in SHOW INNODB STATUS output, they almost never would be cause of your lock waits or deadlocks.

Mysql Differences between READ-COMMITTED and REPEATABLE-READ

Differences between READ-COMMITTED and REPEATABLE-READ transaction isolation levels (http://www.mysqlperformanceblog.com/2012/08/28/differences-between-read-committed-and-repeatable-read-transaction-isolation-levels/)

As an instructor with Percona I’m sometimes asked about the differences between the READ COMMITTED and REPEATABLE READ transaction isolation levels. There are a few differences between READ-COMMITTED and REPEATABLE-READ, and they are all related to locking.

Extra locking (not gap locking)
It is important to remember that InnoDB actually locks index entries, not rows. During the execution of a statement InnoDB must lock every entry in the index that it traverses to find the rows it is modifying. It must do this to prevent deadlocks and maintain the isolation level.

If you run an UPDATE that is not well indexed you will lock many rows:

update employees set store_id = 0 where store_id = 1;
---TRANSACTION 1EAB04, ACTIVE 7 sec
633 lock struct(s), <b>heap size 96696</b>, 218786 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f8dfc35d700, query id 47 localhost root
show engine innodb status

In the employees table, the column store_id is not indexed. Notice that the UPDATE has completed running (we are now running SHOW ENGINE …) but we are holding 218786 row locks and only one undo entry. This means that only one row was changed, but we are still holding extra locks. The heap size represents the amount of memory that has been allocated for locks.

In REPEATABLE READ every lock acquired during a transaction is held for the duration of the transaction.

In READ COMMITTED the locks that did not match the scan are released after the STATEMENT completes.

Here is the UPDATE statement repeated under READ COMMITTED:

---TRANSACTION 1EAB06, ACTIVE 11 sec
631 lock struct(s), <b>heap size 96696</b>, 1 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f8dfc35d700, query id 62 localhost root
show engine innodb status

You’ll notice that the heap size is the same, but we are now holding only one lock. In all transaction isolation levels InnoDB creates locks over every index entry scanned. The difference between the levels is that once the statement completes in READ COMMITTED mode, the locks are released for the entries that did not match the scan. Note that InnoDB does not immediately release the heap memory back after releasing the locks, so the heap size is the same as as that in REPEATABLE READ, but the number of locks held is lower (only one).

This means that in READ COMMITTED other transactions are free to update rows that they would not have been able to update (in REPEATABLE READ) once the UPDATE statement completes.

Consistent read views
In REPEATBLE READ, a ‘read view’ ( trx_no does not see trx_id >= ABC, sees < ABB ) is created at the start of the transaction, and this read view (consistent snapshot in Oracle terms) is held open for the duration of the transaction. If you execute a SELECT statement at 5AM, and come back in an open transaction at 5PM, when you run the same SELECT, then you will see the exact same resultset that you saw at 5AM. This is called MVCC (multiple version concurrency control) and it is accomplished using row versioning and UNDO information.

In REPEATABLE READ InnoDB also creates gap locks for range scans.

select * from some_table where id > 100 FOR UPDATE;

The above update will create a gap lock that will prevent any rows with id > 100 from being inserted into the table until the transaction rolls back or commits.

In the same transaction, if the SELECT … FOR UPDATE is run at 5AM, and an UPDATE is run at 5PM (“UPDATE some_table where id > 100″) then the UPDATE will change the same rows that SELECT FOR UPDATE locked at 5AM. There is no possibility of changing additional rows, because the gap after 100 was previously locked.

Non-repeatable reads (read committed) (http://www.mysqlperformanceblog.com/2010/06/08/table-locks-in-show-innodb-status/)
In READ COMMITTED, a read view is created at the start of each statement. This means that a SELECT made at 5AM may show different results from the same SELECT run at 5PM, even in the same transaction. This is because in READ COMMITTED the read view for the transaction lasts only as long as each statement execution. As a result, consecutive executions of the same statement may show different results.

This is called the ‘phantom row’ problem.

In addition, in READ COMMITTED gap locks are never created. Since there is no gap lock, the example SELECT .. FOR UPDATE above will not prevent insertions of new rows into the table by other transactions. Thus, locking rows with SELECT … FOR UPDATE (ie “where id> 100″) and subsequently updating rows with “where id> 100″ (even in the same transaction) may result in more rows being updated than were earlier locked. This is because new rows may have been inserted in the table between the statements since there was no gap lock created for the SELECT … FOR UPDATE.

Saturday, March 14, 2015

How JPA synchronizes related entities to database

When we perform em.*(entity1) and entity1 has relationship with entity2 then em atleast tries to maintain the relationships information in the database even though cascade is not declared for the performed operation. If em doesn't have enough information to store the defined relationships in the database, then it throws exception. Relations are maintained in the database by storing foreign keys to the parent table. Hence em expects access to the join column owning entity. If entity1 is the owner of the relationship, then entity1 table would be holding the join column. Here to retain the relationship, we just need to store entity2's id in the foreign key column of entity1. For this to happen the minimum requirement is entity2 should be an already persistent entity (because we need it's id in the database) , it can be detached also. If entity2 owns the join column and is already persistent then em has no work. If entity2 owns the join column and is not already persistent, then we can either call em. persist(entity2) or declare cascade(persist) on entity1.

Saturday, March 7, 2015

how mysql varbinary datatype works

Consider a table 'Collection' has the below columns:

collectionId SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
allocationBitMap VARBINARY(1000) NOT NULL,
deviceCount SMALLINT NOT NULL DEFAULT 0,

Try to save a String value '1' into VARBINARY column as shown below. Any value in single quotes is treated as String by Mysql:

insert into Collection(allocationBitMap,deviceCount) values('1',1);

If you query the inserted data, you'll notice '1' being stored as 0x31 (in hex represenation) 49 (in decimal representation. Based on the character set of the OS, numerical representation for the character '1' is stored in the database.

collectionId;HEX(allocationBitMap);deviceCount
1;0x31;1

Try to save hex value 0x12345 into VARBINARY column:

insert into Collection(allocationBitMap,deviceCount) values(0x12345,1);



If you query the inserted data, you'll notice 0x12345 being stored as 0x012345. VARBINARY stores data in bytes. It starts with least significant bits, collects 8 bits, stores them in a bytes, collects next set of 8 bits stores them in the next byte. In our example it starts with the first set of least significant 8 bits which is 0x45, next least is 0x23, next least is 1. As byte is the storage unit one is converted to 0x01 and stored in the next byte. 
collectionId;HEX(allocationBitMap);deviceCount

2;0x012345;1

When I read this data into Java program byte[], it is represented as below in the debugger window. Debugger displayed decimal value for 0x01, 0x23, 0x45:


allocationBitMap (id=69) 
 [0] 1 
 [1] 35 
 [2] 69

When we try to display VARBINARY value, how the value get displayed is dependent on the client. 
If the client is Console, it tries to display every thing it read on the screen. So it considers every thing it read from the database to be a numerical representation of ASCII character set and tries to map those numerical codes to ASCII characters. If the binary data doesn't belong to ASCII character set then some unidentified symbols get displayed on the screen. 


If the client is more intelligent MySql client, then it knows about VARBINARY mysql type, It won't try to convert them to a specific character set as VARBINARY can hold any kind of binary data (photos, vedios) that won't map to any character set.

technical