Split String in MySQL

The easiest way to split string based on the delimiter is using loop. But it is also the most inefficient way. The following is to demonstrate how to use a utility table number and a scalar function to split a string. It is much faster than loop.

— First create the utility table number:

CREATE TABLE IF NOT EXISTS number (
number int not null,
PRIMARY KEY (number)
) ENGINE=InnoDB ROW_FORMAT=COMPACT
COMMENT ‘Utility table with continuous numbers’;

— populate number table with numbers
INSERT INTO number(number)
SELECT @row := @row + 1 as number
FROM (select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t,
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t2,
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t3,
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t4,
(select 0 union all select 1 union all select 3 union all select 4 union all select 5 union all select 6 union all select 6 union all select 7 union all select 8 union all select 9) t5,
(SELECT @row:=0) r

The code above populates 100000 rows (1 – 100000) into the table number. The query cross joins 5 subqueries, each has 10 rows built by the UNION ALL.  The (SELECT @row”=0) just initialize the session variable @row to 0. We can remove it from this INSERT query  by  initializing it ahead of it, e.g. SET @row=0;

— Let’s create the scalar function:

DELIMITER $$
DROP FUNCTION IF EXISTS ufn_split_string
$$

CREATE FUNCTION ufn_split_string(
x text,
delim VARCHAR(12),
pos INT
)
RETURNS VARCHAR(255)
LANGUAGE SQL
DETERMINISTIC
RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos),
CHAR_LENGTH(SUBSTRING_INDEX(x, delim, pos -1)) + 1),
delim, ”)
$$

The function returns the Nth (specified by parameter pos) delimited substring in the string x. The delimiter is delim. It uses SUBSTRING_INDEX MySQL built-in function, which returns a substring from a string before the specified number of occurrences of the delimiter. The scalar function calls SUBSTRING_INDEX twice to get the Nth and and (N-1)th occurrence of the delimiter and REPLACE the latter one with empty string in the first one so we get the substring delimited by (N-1)th and Nth occurrence of the delimiter.

To split a delimited string, we need to use the scalar function ufn_split_string and the number table together:

SET @str=’AA,BB,cc,DD,ee’;
SELECT a.val FROM (
SELECT CAST(ufn_split_string(@str,’,’,n.number) AS char(255)) AS val
FROM number n
WHERE n.number<=(CHAR_LENGTH(@str)/2)) a
where a.val<>”;

it returns:

val
AA
BB
cc
DD
ee

If you know the number of items in the string, you can replace (CHAR_LENGTH(@str)/2) with the real number.

Advertisements
This entry was posted in MySQL. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s