Regular Expression feature in Notepad++ (And some other advance Text Editor)

今天请病假,但是还要在家里做工,显。。。
无所谓,趁机来这里 update 一下,
以上牢骚以中文书写,包同事看不懂,看得懂的也无所谓。

Last time I receive a list of mobile number in .csv file, I need to insert those mobile numbers into temporary table in SQL Server.

The data in .csv file looks like below:

60121234567
60122345678
60133456789
60164567890
60195678901
.
.
.
60140011223

There are few ways to do that, one of the method is using DTS (Or SSIS in MSSQL 2005) to transfer data. However I need to login to Remote Desktop (RD), transfer the .csv file to remote server to do that.

And that day the only two slots for RD already used by my colleagues. So I need to figure out other way.

Ok, I was thinking this way:

  1. First, since I can connect to remote SQL Server from my local PC, so I can create a temp table from my PC using create table script.
  2. After I got temp table, I can run the INSERT INTO [TEMP_TABLE] (mobile) VALUES (‘601XXXXXXXX’) statement from local PC.
  3. Duplicate the insert statement in # 2 for all mobile numbers, and run it from SQL Query.
  4. Now I will have a temp table with all mobile numbers inside, no need DTS or SSIS.

Step 1 is not an issue for me, but when come to step 2, I need to append the word INSERT INTO [TEMP_TABLE](mobile) VALUES (‘<mobile-number-in-each-row>‘) to every rows in the .csv file.

So the best way is to replace mobile numbers in each row with the BOLD words above, ok, replace … , replace … , replace function in Notepad++ (Editor I using now), then I press CTRL + H after open Notepad++.

Ehh~~?!, why the Notepad++ search box has one option to pick the search mode for “Regular Expression”? Arrrhh~~~~, I knew it, maybe the Notepad++ author already foresee the situation I facing now.

That is great, I make a quick search on http://notepad-plus.sourceforge.net/uk/regExpList.php, then give a try to my .csv file, haha, it work now.

Description:

  1. Open Notepad++, then press CTRL + H.
  2. Key in “(601[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])” in “Find What” textbox.
  3. Key in “INSERT INTO [0_TEMP] VALUES(‘\1’)” in “Replace With” textbox.
  4. Choose “Regular Expression” in “Search Mode”.
  5. Click “Replace All” button.
  6. Done, yeah.

Below is how the search box should look like (may vary on different version):

Regular expression replace feature in Notepad++Regular expression replace feature in Notepad++

The final outcome will look like this:

INSERT INTO [0_TEMP] VALUES(‘60121234567’)
INSERT INTO [0_TEMP] VALUES(‘60122345678’)
INSERT INTO [0_TEMP] VALUES(‘60133456789’)
INSERT INTO [0_TEMP] VALUES(‘60164567890’)
INSERT INTO [0_TEMP] VALUES(‘60195678901’)
.
.
.
INSERT INTO [0_TEMP] VALUES(‘60140011223’)

Just a brief explanation to the regexp pattern above (easy for me for future reference):

  1. The pattern I using now only support by Notepad++, it is not global regexp pattern.
  2. Pattern enclose in “(” and “)” is consider as single parameter that going to be replace later.
  3. Notepad++ can recognize the target parameter by using \1, \2, \3 until \9. In my case I only use “\1”.

== [ End ] ==

Words after end: If notepad++ has the features: 1) Column edit, and 2) Hex Edit, then I will vote Notepad++ as the best editor, currently I will say UltraEdit is the best although the interface is qute ugly.

Search Keyword: Tips, Tricks, Notepad++, Tools,  Regular expression, regexp, text editor.

Regular Expression checking for Malaysia mobile no.

Malaysia mobile no is in format 601XXXXXXXX (You can find more details from http://www.skmm.gov.my/)

I start learning regular rexpression when I want to do mobile no. validation, basically just refer to two pages, Regular Expression CheatSheet and Regular Expression Tester, a very useful resources to me.

Below is my RegExp test code for mobileNo validation:

^6?01\d{8}$

Let go through the code (Refer to Regular Expression CheatSheet):

  1. This reg. exp. only accept format in (601AAAAAAAA) or (01AAAAAAAA), A is digit from 0 ~ 9.
  2. ‘^601’ indicates that mobile number must start with ‘601’.
  3. Since I want to allow mobile no. in format (01AAAAAAAA), so add a ‘?’ after ‘^6’, it become like this ‘^6?01’. ‘?’ means there is no matching of previous expression or maximum one matching of previous expression, ‘6’ is the one.
  4. In (601AAAAAAAA), notice that A x 8, so the regexp follow by ‘\d{8}’, ‘\d’ match any decimal digit, ‘{8}’ says that any eight decimal digit after that will be match. So the test code is ‘^6?01\d{8}’.
  5. Yeah the story end. ‘^6?01\d{8}’ will be the final code for my validation.
  6. After some time, I found the number ‘60191234567999999’ can pass my validation, oops~~, need to fix the checking.
  7. After browses some regexp example, ‘$’ is found to prevent above error (previously I do not know what is that ‘$’ serve for).
  8. The ‘$’ will make sure ONLY eight decimal digit in the mobile no., so now ‘60191234567999999’ wouldn’t pass my test since there are 13 digits after ‘6019’, but ‘60191234567’ will pass the test.
  9. Ok, story end again.

Anybody has better test code are welcome. 🙂

Reference:

  1. http://regexlib.com/CheatSheet.aspx
  2. http://regexlib.com/RETester.aspx
  3. http://regexlib.com/DisplayPatterns.aspx (Regexp examples)