Thursday, March 8, 2012

MySQL Quick Hit: Drop First Word

Here is a quick tidbit you might find helpful.

Recently I ran into a scenario where a number of records in a table named contacts had an undesired word at the beginning of the value in the last_name column. Here is an example:

Table: Contacts
first_name       last_name
Angel            Zyx Magaña

This complicated the searching of records based on last name because one had to use a wildcard or search for the preceding value in the last_name field, along with the desired value, hence the need to adjust the data. 

To adjust the data, it was necessary to execute an UPDATE query against the table in question. However, it was important to take into consideration the fact that there were some records in the table that did not have the offending value preceding the actual last name. The only real way to differentiate between valid and invalid entries was to assume that any record with more than 1 word in the last_name field required adjustment.

Side note: The caveat with this approach is that valid last name values that use 2 words, such as Mac Donald, would also be affected. However, the number of records meeting that criteria was very low and could easily be corrected manually at a later time.

Based on this strategy, our first step would be to figure out the word count on the field. Unfortunately, MySQL does not provide a function to do this, but a reasonable alternative would be to figure out if there if a space exists in the value of the field. A space in the field would equate to a separation between two or more words. For this, MySQL does provide a function, called LOCATE()

Given our main concern was to drop the first word in the field, i.e. anything before the first instance of a space, the LOCATE() function serves our purposes rather well. Next, our query had to instruct MySQL to update the last_name column with the value following the space, whether it was 1 or 4 additional words. Finally, we also had to limit it to records where a space actually existed between words. This latter part is accomplished by the WHERE clause in our final SQL query, demonstrated below:

UPDATE contacts SET last_name = TRIM(SUBSTR(last_name, LOCATE(' ', TRIM(last_name))))
WHERE LOCATE(' ', TRIM(last_name)) > 0

In case you are wondering about the use of the TRIM() function, it is to eliminate any preceding and trailing spaces in the value.

No comments:

Post a Comment

Your comments, feedback and suggestions are welcome, but please refrain from using offensive language and/or berating others. Thank you in advance.