Extracting numbers from alphanumeric strings


Most Valuable Professional icon

This article was written by Ashish Mathur, a Microsoft MVP (Most Valuable Professional). Visit the Microsoft MVP Web site for more information.

In the following article I explain a formula that extracts numbers from strings in the following types of situations:

  • When alphabetic and number characters are clustered together, such as abc123 or 678sfr.
  • When alphabetic and number characters are not clustered together, such as abc15tni.

Problem

How to extract the numeric portion of an alphanumeric string. For example: If cell A1 contains the string "abc123", return the value 123 in cell B1.

Solution

The underlying logic here is to search for and return the first number in the alphanumeric string, and then return only the numbers that follow.

Algorithm

The solution involves creating formulas to accomplish the following tasks:

  1. Break the alphanumeric string into separate characters.
  2. Determine whether there is a number in the decomposed string.
  3. Determine the position of the number in the alphanumeric string.
  4. Count the numbers in the alphanumeric string.

We will consider each of these tasks separately, and then integrate the formulas to achieve the final result.

Break the alphanumeric string into separate characters

Use the MID function here. MID returns a specific number of characters from a text string, starting at the position you specify, based on the number of characters you specify. The syntax of the function is:

MID(text,start_num,num_chars)
  • Text   The text string containing the characters you want to extract.
  • Start_num   The position of the first character you want to extract in text. The first character in text has start_num 1, and so on.
  • Num_chars   Specifies the number of characters you want MID to return from text.

For our example, the formula is:

=MID(A1,ROW($1:$9),1)
This formula decomposes the alphanumeric string and puts the characters — virtually — into different rows of the worksheet. For example, for alphanumeric string abc123, all the 6 characters would be segregated.

 Note   The number 9 may be suitably increased to any higher number to accommodate larger strings. In this example, the maximum string length is 9.

It is worthwhile to note that when the string is decomposed, "1", "2" and "3" are treated as text — not as numbers. To convert the numbers stored as text into numbers, multiply the formula by 1. For example:

=1*MID(A1,ROW($1:$9),1)

Determine whether there is a number in the decomposed string

Here we employ the ISNUMBER function, which determines whether there is a number in the alphanumeric string. The formula now becomes:

=ISNUMBER(1*MID(A1,ROW($1:$9),1))

If there is a number in the string, the result is TRUE, otherwise it is FALSE.

Determine the position of the number in the alphanumeric string

Now we determine the position of the number by locating a TRUE value in the result of the decomposed string mentioned in the paragraph above. We employ the MATCH function here. The revised formula now becomes:

=MATCH(TRUE,ISNUMBER(1*MID(A1,ROW($1:$9),1)),0)

 Important   You must enter this formula as an array by pressing CTRL+SHIFT+ENTER.

If the string is abc123, this formula will yield 4 — that being the position of the first numeric character in the alphanumeric string.

Count the numbers in the alphanumeric string

The task now is to count the numbers in the string, to determine the characters to be returned after the first number in the alphanumeric string.

As mentioned above, the numbers stored as text in the alphanumeric string can be converted into numbers by multiplying them by 1. For example, =1*MID(A1,ROW($1:$9),1)

After the numbers stored as text are converted to numbers, it is possible to count them by using the COUNT function. The numbers can be counted by entering the following formula:

=COUNT(1*MID(A1,ROW($1:$9),1))

Integrate the formulas

We now employ the MID function to integrate the various parts of the formula, as shown in the following example.

=MID(A1,MATCH(TRUE,ISNUMBER(1*MID(A1,ROW($1:$9),1)),0),COUNT(1*MID(A1,ROW($1:$9),1)))

In basic terms, the problem can now be stated as: Determine the position of the first number in the alphanumeric string (in cell A1). Return that number and the numbers that follow.

To convert the resulting characters to numbers, multiply the formula by 1. While this is not strictly required, it should be done if you are going to perform mathematical operations on the result. This is the final formula to be entered in cell B1:

=1*MID(A1,MATCH(TRUE,ISNUMBER(1*MID(A1,ROW($1:$9),1)),0),COUNT(1*MID(A1,ROW($1:$9),1)))

 Important   You must enter these formulas as arrays by pressing CTRL+SHIFT+ENTER.

More examples

To test the formula further, enter the data in the following picture into cells A1:A7 of a blank worksheet.

Example of string conversion

Enter the formula in cell B1, and then use AutoFill copy the formula into cells B2:B7. (Remember to press CTRL+SHIFT+ENTER.)

It is worth mentioning here that if the string is yur09875reew and if you use the formula that is multiplied by 1, the result in column B will be 9875 and not 09875. Since 0*1=0, the 0 is ignored and the result returned is 9875. If you want the result to be 09875, do not multiply the complete formula by 1.
 
 
Applies to:
Excel 2003