This short guide shows how to quickly extract a number from various text strings in Excel, using formulas or the dedicated Extract tool.
The need to pull a number out of text comes up quite often, especially when you work with data imported from other programs. For example, you may need to get a postal code from an address, an item number from a product description, or an invoice number from a payment document. The number may sit anywhere in the text: at the beginning, in the middle, or at the end.
Here is what this article covers:
- How to extract a number from the end of a text string
- How to get a number from the beginning of a text string
- How to extract all numbers from a text string
- Extracting numbers without formulas using Ultimate Suite
When it comes to extracting part of a text value of a given length, Excel offers three text functions that do the job quickly: ЛЕВСИМВ, ПРАВСИМВ and ПСТР (LEFT, RIGHT and MID in English-language Excel; the formulas in this part are written with the Russian function names and semicolon separators). But when it comes to extracting a number from an alphanumeric string, Microsoft Excel offers nothing built in.
Extracting a number from text in Excel takes a little ingenuity, a little patience, and quite a few functions nested inside one another.
Alternatively, you can run the Extract tool from the Ablebits Ultimate Suite add-in and do the job with a single mouse click. Full details on all of these methods follow below.
How to extract a number from the end of a text string
If you have a column of alphanumeric values in which the number always follows the text, you can use one of the following formulas to pull the numbers out.
Important note! The formulas below do the extraction with the ПРАВСИМВ (RIGHT) and ЛЕВСИМВ (LEFT) functions, which belong to the text category. These functions always return text. In our case the result is a numeric substring, which Excel still treats as text, not as a number. If you need the result to be a number (one you can use in further calculations), wrap the formula in the ЗНАЧЕН (VALUE) function or apply a trivial arithmetic operation to it (for example, a double negation).
To extract a number from a "text-number" string, the first thing you need to know is the position at which to start. So let's find the position of the first digit with this generic expression:
=МИН(ПОИСК({0;1;2;3;4;5;6;7;8;9}; cell &"0123456789"))
In short, appending "0123456789" to the cell guarantees that ПОИСК (SEARCH) finds every one of the ten digits and never returns an error, and МИН (MIN) then picks the smallest position, which is the first digit of the original text if it contains one. For now, simply replace "cell" with a reference to the cell containing the source text (A2 in our case) and enter the resulting expression in any empty cell of the same row, say B2:
=МИН(ПОИСК({0;1;2;3;4;5;6;7;8;9};A2&"0123456789"))
Although the formula contains an array constant, it is a regular formula entered in the usual way, by pressing Enter.
Once the position of the first digit is known, you can use ПРАВСИМВ to extract the number. To find out how many characters to extract, subtract the position of the first digit from the total length of the string and add one, because the first digit itself must be included:
=ПРАВСИМВ(A2;ДЛСТР(A2)-B2+1)
Where A2 is the source cell and B2 is the position of the first digit.
The following screenshot shows the results:
To get rid of the helper column that holds the position of the first digit, you can embed the МИН formula directly in ПРАВСИМВ:
=ПРАВСИМВ(A2;ДЛСТР(A2)-МИН(ПОИСК({0;1;2;3;4;5;6;7;8;9};A2&"0123456789"))+1)
To make the formula return a true number rather than a numeric string, nest it in ЗНАЧЕН:
=ЗНАЧЕН(ПРАВСИМВ(A2;ДЛСТР(A2)-МИН(ПОИСК({0;1;2;3;4;5;6;7;8;9};A2&"0123456789"))+1))
Or simply apply a double negation by putting two minus signs in front:
=--ПРАВСИМВ(A2;ДЛСТР(A2)-МИН(ПОИСК({0;1;2;3;4;5;6;7;8;9};A2&"0123456789"))+1)
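To see why the appended "0123456789" matters, here is a minimal Python sketch of the same logic. Python is used purely as an illustration outside Excel; the function names and the sample strings are hypothetical and not part of the original article.

```python
DIGITS = "0123456789"

def first_digit_position(text: str) -> int:
    """1-based position of the first digit, mirroring
    MIN(SEARCH({0..9}, text & "0123456789"))."""
    padded = text + DIGITS  # sentinel: every digit is guaranteed to be found
    # str.index is 0-based, Excel positions are 1-based, hence the +1
    return min(padded.index(d) + 1 for d in DIGITS)

def trailing_number(text: str) -> str:
    """Mirror of RIGHT(text, LEN(text) - position + 1); the result is still text."""
    count = len(text) - first_digit_position(text) + 1
    return text[len(text) - count:] if count > 0 else ""

print(first_digit_position("Invoice 2041"))  # 9
print(trailing_number("Invoice 2041"))       # 2041
print(trailing_number("ab12cd34"))           # 12cd34 (everything from the first digit on)
```

Without the sentinel, a digit that is missing from the text would make the search fail; with it, missing digits simply land past the end of the original string and never win the minimum.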
Another way to extract a number from the end of a string is this expression:
=ПРАВСИМВ( cell ;СУММ(ДЛСТР( cell ) - ДЛСТР(ПОДСТАВИТЬ( cell ; {"0";"1";"2";"3";"4";"5";"6";"7";"8";"9"};""))))
With the source text in A2, enter the formula below in B2 or any other empty cell in the same row, then copy it down the column:
=ПРАВСИМВ(A2;СУММ(ДЛСТР(A2) - ДЛСТР(ПОДСТАВИТЬ(A2; {"0";"1";"2";"3";"4";"5";"6";"7";"8";"9"};""))))
Note. These formulas are intended for cases where digits occur only at the end of the text string. If some digits also appear in the middle or at the beginning, the formulas will not return the expected result.
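The digit-counting idea behind this formula can be checked with a short Python sketch (an illustration only; the function names and sample strings are hypothetical). It counts how many characters disappear when each digit is removed, then takes that many characters from the right.

```python
DIGITS = "0123456789"

def count_digits(text: str) -> int:
    """Mirror of SUM(LEN(text) - LEN(SUBSTITUTE(text, digit, ""))) over the ten digits."""
    return sum(len(text) - len(text.replace(d, "")) for d in DIGITS)

def right_digits(text: str) -> str:
    """Mirror of RIGHT(text, <digit count>)."""
    n = count_digits(text)
    return text[len(text) - n:] if n else ""

print(right_digits("Item A-1057"))  # 1057
print(repr(right_digits("7 boxes 42")))  # ' 42' (a leading digit inflates the count)
```

The second call shows the limitation described in the note: a digit elsewhere in the string is counted too, so the formula takes one character too many from the right.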
The third formula does not have this limitation: it extracts only the last number in the text and ignores all preceding ones:
=ПРАВСИМВ(A2; ДЛСТР(A2) - МАКС(ЕСЛИ(ЕЧИСЛО(ПСТР(A2; СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))); 1) *1)=ЛОЖЬ; СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))); 0)))
The screenshot below shows its result.
As you can see, digits at the beginning or in the middle of the text are ignored. Note also that the result, as with the previous formulas, is a number stored as text. How to turn it into a real number was covered earlier in this article.
Note. In Excel 2019 or earlier, enter this as an array formula by pressing Ctrl+Shift+Enter. In Office 365, enter it as usual with Enter.
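As a rough illustration of what the array formula computes, the Python sketch below finds the position of the last non-digit character and returns everything after it. The function name and test strings are hypothetical; only plain ASCII digits are treated as digits.

```python
DIGITS = "0123456789"

def last_number(text: str) -> str:
    """Mirror of RIGHT(text, LEN(text) - MAX(IF(<char is not a digit>, position, 0)))."""
    # 1-based position of the last non-digit character, or 0 if there is none
    last_non_digit = max(
        (i for i, ch in enumerate(text, start=1) if ch not in DIGITS),
        default=0,
    )
    # Dropping the first `last_non_digit` characters keeps only the trailing digits
    return text[last_non_digit:]

print(last_number("7 boxes 42"))  # 42
print(last_number("12345"))       # 12345
```

If the string ends with a non-digit, nothing follows the last non-digit and the result is an empty string.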
How to extract a number from the beginning of a text string
If you work with strings in which the text follows the number, the solution is similar to the one described above. The only difference is that you use the ЛЕВСИМВ (LEFT) function to extract from the left side of the text:
=ЛЕВСИМВ( cell ;СУММ(ДЛСТР( cell )-ДЛСТР(ПОДСТАВИТЬ( cell ;{"0";"1";"2";"3";"4";"5";"6";"7";"8";"9"};""))))
Applying this method to A2, we extract the number with this expression:
=ЛЕВСИМВ(A2;СУММ(ДЛСТР(A2)-ДЛСТР(ПОДСТАВИТЬ(A2;{"0";"1";"2";"3";"4";"5";"6";"7";"8";"9"};""))))
This solution works for text values that contain digits only at the beginning. If some digits also appear in the middle or at the end of the string, the formula will not return the expected result.
If you want to extract only the number on the left and ignore the rest, use a different formula:
=ЛЕВСИМВ(A2;ПОИСКПОЗ(ЛОЖЬ;ЕЧИСЛО(--ПСТР(A2;СТРОКА($1:$94);1));0)-1)
Or a slightly modified version that checks only as many positions as the string needs, to speed up the calculation:
=ЛЕВСИМВ(A2; ПОИСКПОЗ(ЛОЖЬ; ЕЧИСЛО(ПСТР(A2; СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2)+1)); 1) *1); 0) -1)
In Excel 2019 and earlier, enter it as an array formula with Ctrl+Shift+Enter. In Office 365 you can enter it as usual.
Note. Like ПРАВСИМВ, the ЛЕВСИМВ function returns a numeric substring that is technically text, not a number.
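The ПОИСКПОЗ(ЛОЖЬ; ...) step, that is MATCH(FALSE, ...), looks for the first character that is not a digit; everything before it is the leading number. A minimal Python sketch of that idea follows (hypothetical function name and sample strings, ASCII digits only).

```python
DIGITS = "0123456789"

def leading_number(text: str) -> str:
    """Mirror of LEFT(text, MATCH(FALSE, <is digit?>, 0) - 1)."""
    for i, ch in enumerate(text):
        if ch not in DIGITS:
            # i characters precede the first non-digit
            return text[:i]
    # All characters are digits; the Excel version handles this by scanning
    # one position past the end (LEN + 1), where MID returns an empty string
    return text

print(leading_number("25arnold"))     # 25
print(leading_number("12 pcs of 5"))  # 12
```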
How to get a number from any position in the text
If your task involves extracting a number from anywhere in a string, you can use the following formula:
=СУММПРОИЗВ(ПСТР(0&A2; НАИБОЛЬШИЙ(ИНДЕКС(ЕЧИСЛО(--ПСТР(A2; СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))); 1)) * СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))); 0); СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))))+1; 1) * 10^СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2)))/10)
Where A2 is the source text string.
Explaining how it works would take a separate article, so you can simply copy the formula to your worksheet to make sure that it really works.
Note that in this case the text may contain several numbers. All of them are extracted and joined into a single number.
Looking at the results, however, you may notice one minor drawback: if the source text contains no number, the formula returns zero, as in row 7 of the screenshot above. To fix this, you can wrap the formula in an ЕСЛИ (IF) statement that checks whether the source text contains any digit. If it does, the formula extracts the number; otherwise it returns an empty string:
=ЕСЛИ(СУММ(ДЛСТР(A2)-ДЛСТР(ПОДСТАВИТЬ(A2;{"0";"1";"2";"3";"4";"5";"6";"7";"8";"9"};"")))>0; СУММПРОИЗВ(ПСТР(0&A2; НАИБОЛЬШИЙ(ИНДЕКС(ЕЧИСЛО(--ПСТР(A2; СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))); 1)) * СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))); 0); СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2))))+1; 1) * 10^СТРОКА(ДВССЫЛ("1:"&ДЛСТР(A2)))/10);"")
Unlike all the previous examples, these formulas return a number. To confirm this, note the right-aligned values in column B and the dropped leading zeros (for example, 88 instead of 088).
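Although the article does not walk through the formula, its core idea can be sketched in Python: list the positions of the digits from right to left, then multiply each digit by an increasing power of ten and add everything up. This is an illustrative reading of the formula, with a hypothetical function name and sample strings.

```python
DIGITS = "0123456789"

def digits_as_number(text: str) -> int:
    """Combine every digit in the text into one number using place values."""
    # 1-based positions of the digits, rightmost first
    # (the role LARGE(...) plays in the formula)
    positions = sorted(
        (i for i, ch in enumerate(text, start=1) if ch in DIGITS),
        reverse=True,
    )
    # Rightmost digit * 1, next * 10, next * 100, ... (the 10^ROW(...)/10 factor)
    return sum(int(text[p - 1]) * 10 ** k for k, p in enumerate(positions))

print(digits_as_number("The price of 10 tickets is USD 200"))  # 10200
print(digits_as_number("abc088"))                              # 88
print(digits_as_number("no digits here"))                      # 0
```

Because the result is built arithmetically, it is a true number: leading zeros vanish and a string without digits yields 0, which matches the behaviour described above.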
If the number you want to extract is bounded by delimiter characters, you can use the ПСТР (MID) function. Let's look at an example: getting an invoice number from the text of a payment.
We look for the position of the "№" sign and the position of the first space that follows it. Whatever lies between them is the invoice number:
=ПСТР(ПОДСТАВИТЬ(A2;" ";"");НАЙТИ("№";ПОДСТАВИТЬ(A2;" ";""))+1;НАЙТИ(" ";A2;НАЙТИ("№";A2;1))-НАЙТИ("№";A2;1)-1)
The screenshot below shows how this works.
Here is one more way of pulling a number out of text. The task is to extract the first number that occurs in the text.
We use the formula
=ПРОСМОТР(2^64;--ЛЕВСИМВ(ПСТР(A1&"_0";МИН(НАЙТИ({0;1;2;3;4;5;6;7;8;9};A1&"_0123456789"));15); {1;2;3;4;5;6;7;8;9;10;11;12;13;14;15}))
or replace the list of lengths with a function:
=ПРОСМОТР(2^64;--ЛЕВСИМВ(ПСТР(A1&"_0";МИН(НАЙТИ({0;1;2;3;4;5;6;7;8;9};A1&"_0123456789"));15); СТРОКА($A$1:$IV$16)))
As you can see, we get only the first number, wherever it is located:
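A simplified Python analogue of the "first number" idea is shown below. It is only a sketch under stated assumptions: it recognises plain runs of ASCII digits, whereas Excel's text-to-number conversion also accepts other notations, and the function name, the `max_len` parameter, and the sample strings are hypothetical.

```python
DIGITS = "0123456789"

def first_number(text: str, max_len: int = 15) -> int:
    """Return the first run of digits in the text as an integer, or 0 if none."""
    # Position of the first digit (the MIN(FIND(...)) step)
    start = next((i for i, ch in enumerate(text) if ch in DIGITS), None)
    if start is None:
        return 0  # assumption: the "_0" sentinel in the formula leads to 0 here
    # Extend the run while the characters are digits, up to max_len characters
    end = start
    while end < len(text) and end - start < max_len and text[end] in DIGITS:
        end += 1
    return int(text[start:end])

print(first_number("order 15 of 200"))  # 15
print(first_number("x007y"))            # 7
```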
And one more example. Let's try to get all the numbers out of the text and separate them with a delimiter, for example a hyphen "-".
In this case you have to use an array formula:
{=ПОДСТАВИТЬ(СЖПРОБЕЛЫ(СЦЕП(ЕСЛИ(ЕЧИСЛО(--ПСТР(A2;СТРОКА($1:$94);1));ПСТР(A2;СТРОКА($1:$94);1);" ")));" ";"-")}
We have found all the numbers in the text, as the screenshot below shows:
By adjusting this formula, you can use any other delimiter.
Because the numbers are separated by delimiters, you can easily split them into separate cells. To learn how, see the article "8 ways to split an Excel cell into two or more".
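The array formula works in three steps: replace every non-digit with a space, collapse the runs of spaces (СЖПРОБЕЛЫ, i.e. TRIM), and swap the remaining single spaces for the delimiter. A minimal Python sketch of the same steps follows; the function name and sample strings are hypothetical.

```python
DIGITS = "0123456789"

def numbers_joined(text: str, sep: str = "-") -> str:
    """Return all digit runs in the text, joined with `sep`."""
    # Step 1: keep digits, turn everything else into a space (the IF(ISNUMBER(...)) part)
    spaced = "".join(ch if ch in DIGITS else " " for ch in text)
    # Steps 2 and 3: split() drops leading, trailing and repeated spaces (like TRIM),
    # and join() inserts the delimiter (like SUBSTITUTE(..., " ", "-"))
    return sep.join(spaced.split())

print(numbers_joined("10 tickets for 200 USD, row 7"))  # 10-200-7
print(numbers_joined("a1b2", sep="/"))                  # 1/2
```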
How to extract a number from text with Ultimate Suite
As you have just seen, there is no simple, obvious Excel formula for extracting numbers from a text string. If you find the formulas hard to follow or to adapt to your data sets, you may prefer this simpler way of getting a number out of text in Excel.
With the Ultimate Suite add-in on your Excel ribbon, you can quickly get the number from any alphanumeric string:
- Go to the Ablebits Data tab > Text and click Extract:
- Select all the cells with the data you want to process.
- On the Extract pane, select the Extract numbers option.
- Depending on whether you want the results as formulas or as values, check the Insert as formula box or leave it unchecked (the default).
I recommend checking this box if you want the extracted numbers to update automatically whenever the source values change. If the results should not depend on future changes (for example, if you plan to delete the source data later), leave it unchecked.
- Click the Insert Results button. Done!
As in the previous example, the extracted results are numbers. This means you can count, sum, average, or run any other calculation on them.
If the Insert as formula box is checked, you will see a formula in the formula bar. Curious which one? Download the trial version of Ultimate Suite and see for yourself.
If you would like to have this tool, along with more than 60 other useful tools, in Excel, give the add-in a try.
I have tried to give you the fullest possible guidance on the ways to extract a number from text. Of course, it cannot cover every possible case, so if you run into something particularly tricky, feel free to describe it in the comments and I will do my best to help.
Extracting numbers from a list of cells with mixed text is a common data cleaning task.
Unfortunately, there is no direct menu or function created in Excel to help us accomplish this.
In this tutorial we will look at three cases where you might have a list of mixed text, from which you might want to extract numbers:
- When the number is always at the end of the text
- When the number is always at the beginning of the text
- When the number can be anywhere in the text
We will look at three different formulas that can be used to extract the numbers in each case.
At the end of the tutorial, we will also take a look at some VBA code that you can use to accomplish the same.
Brace yourself, this might get a little complex!
Extracting a Number from Mixed Text when Number is Always at the End of the Text
Consider the following example:
Here, each cell has a mix of text and numbers, with the number always appearing at the end of the text. In such cases, we will need to use a combination of nested Excel functions to extract the numbers.
The functions we will use are:
- FIND – This function searches for a character or string in another string and returns its position.
- MIN – This function returns the smallest value in a list.
- LEFT – This function extracts a given number of characters from the left side of a string.
- SUBSTITUTE – This function replaces a particular substring of a given text with another substring.
- IFERROR – This function returns an alternative result or formula if it finds an error in a given formula.
Essentially, we will use the above functions together to perform the following sequence of tasks:
- Find the position of the first numeric value in the given cell
- Extract and remove the text part of the given cell (by removing everything to the left of the first numeric digit)
The formula that we will use to extract the numbers from cell A2 is as follows:
=SUBSTITUTE(A2,LEFT(A2,MIN(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))-1),"")
Let us break down this formula to understand it better. We will go from the inner functions to the outer functions:
- FIND({0,1,2,3,4,5,6,7,8,9},A2)
This function looks up the position of each digit (0-9) in cell A2 and returns an array:
{#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,9,7,8,#VALUE!}
Every element is a #VALUE! error except the 7th, 8th, and 9th, because the digits 0, 1, 2, 3, 4, 5, and 9 do not occur in A2. The remaining elements are the positions of the digits 6, 7, and 8, which are the 9th, 7th, and 8th characters of A2 respectively.
- IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),"")
Next, the IFERROR function replaces every error element of the array with an empty string (""), giving the array:
{"","","","","","",9,7,8,""}
- MIN(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))
The MIN function then finds the smallest numeric element, ignoring the empty strings. This is the position of the first numeric character in A2, so the function returns 7.
- LEFT(A2,MIN(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))-1)
At this point we want to extract all the text characters from A2 (so that we can remove them). These are the characters before the first digit, so we ask LEFT for the first 7-1 = 6 characters, which is why 1 is subtracted in the second parameter. The result at this point is "arnold".
- SUBSTITUTE(A2,LEFT(A2,MIN(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))-1),"")
Now all that's left is to remove that string by replacing it with an empty string, which the SUBSTITUTE function does. We finally get the numeric characters of the mixed text: "786".
Once you are done entering the formula, make sure you press CTRL+SHIFT+Enter instead of just the return key. This is because the formula involves arrays. (In Excel for Microsoft 365, which handles array formulas natively, a plain Enter is enough.)
In a nutshell, here's what's happening when you break down the formula:
=SUBSTITUTE(A2,LEFT(A2,MIN(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))-1),"")
=SUBSTITUTE(A2,LEFT(A2,MIN(IFERROR({#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,#VALUE!,9,7,8,#VALUE!},""))-1),"")
=SUBSTITUTE(A2,LEFT(A2,MIN({"","","","","","",9,7,8,""})-1),"")
=SUBSTITUTE(A2,LEFT(A2,7-1),"")
=SUBSTITUTE(A2,"arnold","")
=786
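The same breakdown can be reproduced outside Excel. The Python sketch below is only an illustration (the function names are hypothetical, and `None` stands in for Excel's #VALUE! error); it follows the formula step by step for the string "arnold786".

```python
DIGITS = "0123456789"

def find_positions(text: str) -> list:
    """FIND({0,...,9}, text): 1-based position of each digit, None for 'not found'."""
    return [text.find(d) + 1 if d in text else None for d in DIGITS]

def number_at_end(text: str) -> str:
    # IFERROR(..., "") followed by MIN: keep only the positions that were found
    found = [p for p in find_positions(text) if p is not None]
    if not found:
        # In Excel, LEFT(A2, 0-1) is an error when the text has no digit
        raise ValueError("no digit in text")
    prefix = text[:min(found) - 1]  # LEFT(text, MIN(...) - 1)
    # SUBSTITUTE(text, prefix, ""); an empty prefix leaves the text unchanged
    return text.replace(prefix, "") if prefix else text

print(find_positions("arnold786"))
# [None, None, None, None, None, None, 9, 7, 8, None]
print(number_at_end("arnold786"))  # 786
```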
When you drag the formula down to the rest of the cells, here’s the result you get:
Extracting a Number from Mixed Text when Number is Always at the Beginning of the Text
Now let us consider the case where the numbers are always at the beginning of the Text.
Consider the following example:
Here, each cell has a mix of text and numbers, with the number always appearing at the beginning of the text. In such cases, we will again need to use a combination of nested Excel functions to extract the numbers.
In addition to the functions we used in the previous formula, we are going to use three more. These are:
- MAX – This function returns the largest value in a list
- RIGHT – This function extracts a given number of characters from the right side of a string
- LEN – This function finds the length of (number of characters in) a given string.
Essentially, we will use these functions together to perform the following sequence of tasks:
- Find the position of the last numeric value in the given cell
- Extract and remove the text part of the given cell (by removing everything to the right of the last numeric digit)
The formula that we will use to extract the numbers from cell A2 is as follows:
=SUBSTITUTE(A2,RIGHT(A2,LEN(A2)-MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))),"")
Let us break down this formula to understand it better. We will go from the inner functions to the outer functions:
- FIND({0,1,2,3,4,5,6,7,8,9},A2)
This function looks up the position of each digit (0-9) in cell A2 and returns an array:
{#VALUE!,#VALUE!,1,#VALUE!,#VALUE!,2,#VALUE!,#VALUE!,#VALUE!,#VALUE!}
Every element is a #VALUE! error except the 3rd and 6th, because the digits 0, 1, 3, 4, 6, 7, 8, and 9 do not occur in A2. The remaining elements are the positions of the digits 2 and 5, which are the 1st and 2nd characters of A2 respectively.
- IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),"")
Next, the IFERROR function replaces every error element of the array with an empty string, giving the array:
{"","",1,"","",2,"","","",""}
- MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))
The MAX function then finds the largest numeric element. In this example that is the position of the last numeric character in A2, so the function returns 2. Note that FIND reports only the first occurrence of each digit, so this equals the position of the last digit only when the final digit of the number does not also appear earlier in it.
- LEN(A2)-MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))
At this point we want to extract all the text characters from A2 (so that we can remove them), and we need to specify how many characters to take. This is the length of the string in A2 minus the position of the last numeric character, which gives 8-2 = 6.
- RIGHT(A2,LEN(A2)-MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),"")))
We can now use the RIGHT function to extract the last 6 characters, that is, everything after the 2nd character. The result at this point is "arnold".
- SUBSTITUTE(A2,RIGHT(A2,LEN(A2)-MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))),"")
Now all that's left is to remove this string by replacing it with an empty string, which the SUBSTITUTE function does. We finally get the numeric characters of the mixed text: "25".
Again, once you are done entering the formula, don’t forget to press CTRL+SHIFT+Enter, instead of just the return key.
In a nutshell, here’s what’s happening when you break down the formula:
=SUBSTITUTE(A2,RIGHT(A2,LEN(A2)-MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9},A2),""))),"")
=SUBSTITUTE(A2,RIGHT(A2,LEN(A2)-MAX(IFERROR({#VALUE!,#VALUE!,1,#VALUE!,#VALUE!,2,#VALUE!,#VALUE!,#VALUE!,#VALUE!},""))),"")
=SUBSTITUTE(A2,RIGHT(A2,LEN(A2)-MAX({"","",1,"","",2,"","","",""})),"")
=SUBSTITUTE(A2,RIGHT(A2,LEN(A2)-2),"")
=SUBSTITUTE(A2,RIGHT(A2,8-2),"")
=SUBSTITUTE(A2,"arnold","")
=25
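A Python sketch of this breakdown makes one edge case easy to test. FIND returns the first occurrence of each digit, so when the number's final digit also appears earlier in it (as in "100apples"), MAX points at that earlier occurrence and the result loses digits. The code below is an illustration with hypothetical names; the `use_last_occurrence` switch is an assumption added for comparison, not part of the original formula.

```python
DIGITS = "0123456789"

def number_at_start(text: str, use_last_occurrence: bool = False) -> str:
    # FIND looks from the left; rfind is the comparison variant that looks from the right
    locate = text.rfind if use_last_occurrence else text.find
    # 1-based position per digit present; absent digits are skipped,
    # just as MAX ignores the "" values that IFERROR substitutes
    positions = [locate(d) + 1 for d in DIGITS if d in text]
    cut = max(positions, default=0)
    suffix = text[cut:]  # RIGHT(text, LEN(text) - cut)
    # SUBSTITUTE(text, suffix, ""); an empty suffix leaves the text unchanged
    return text.replace(suffix, "") if suffix else text

print(number_at_start("25arnold"))                              # 25
print(number_at_start("100apples"))                             # 10
print(number_at_start("100apples", use_last_occurrence=True))   # 100
```

For data where digits can repeat, it may be worth checking a few such values before relying on the formula.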
When you drag the formula down to the rest of the cells, here’s the result you get:
Extracting a Number from Mixed Text when Number can be Anywhere in the Text
Finally let us consider the case where the numbers can be anywhere in the text, be it the beginning, end or middle part of the text.
Let’s take a look at the following example:
Here, each cell has a mix of text and numbers, where the number appears in any part of the text. In such cases, we will need to use a combination of nested Excel functions to extract the numbers.
Here are the functions that we are going to use this time:
- INDIRECT – This function returns the reference described by a text string (here, a range of rows).
- ROW – This function returns a row number of a reference.
- MID – This function extracts a given number of characters from the middle of a string.
- TEXTJOIN – This function combines text from multiple ranges or strings using a specified delimiter between them.
Note that the formula discussed in this section works only in versions of Excel that include the TEXTJOIN function (Excel 2019, Excel for Microsoft 365, and Excel 2016 when installed through a Microsoft 365 subscription). If you are using an older version of Excel, you may consider using the VBA method instead (discussed in the next section).
In this method, we will use the above functions together to perform the following sequence of tasks:
- Break up the given text into an array of individual characters
- Find out and remove all characters that are not numbers
- Combine the remaining characters into a full number
The formula that we will use to extract the numbers from cell A2 is as follows:
=TEXTJOIN("",TRUE,IFERROR(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1,""))
Let us break down this formula to understand it better. We will go from the inner functions to the outer functions:
- LEN(A2)
This function finds the length of the string in cell A2. In our case it returns 13.
- INDIRECT("1:"&LEN(A2))
This function just returns a reference to all the rows from row 1 to row 13.
- ROW(INDIRECT("1:"&LEN(A2)))
Now this function returns the row number of each of these rows. In other words, it returns the numbers 1 to 13 as an array:
{1;2;3;4;5;6;7;8;9;10;11;12;13}
Note: We use the ROW() function instead of hard-coding an array of the numbers 1 to 13 because we want the formula to adapt to the length of the string being processed. This ensures that the formula adjusts itself when copied to another cell.
- MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)
Next the MID function extracts the character of A2 at each position listed in the array. In other words, it returns an array containing each character of the text in A2 as a separate element:
{"a";"r";"n";"o";"l";"d";"1";"4";"3";"b";"l";"u";"e"}
- IFERROR(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1,"")
After this, each element is multiplied by 1. A digit character can be multiplied and yields the same number, but a non-numeric character cannot be multiplied by 1 and produces an error.
The IFERROR function then replaces every element that gave an error on multiplication with an empty string. As such it returns the array:
{"";"";"";"";"";"";1;4;3;"";"";"";""}
- TEXTJOIN("",TRUE,IFERROR(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1,""))
Finally, we combine the array elements using the TEXTJOIN function, which joins the elements that remain (the digits only) and ignores the empty strings. We finally get the numeric characters of the mixed text: "143".
Once you are done entering the formula, don’t forget to press CTRL+SHIFT+Enter, instead of just the return key.
In a nutshell, here’s what’s happening when you break down the formula:
=TEXTJOIN("",TRUE,IFERROR(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1,""))
=TEXTJOIN("",TRUE,IFERROR(MID(A2,ROW(INDIRECT("1:"&13)),1)*1,""))
=TEXTJOIN("",TRUE,IFERROR(MID(A2,{1;2;3;4;5;6;7;8;9;10;11;12;13},1)*1,""))
=TEXTJOIN("",TRUE,IFERROR({"a";"r";"n";"o";"l";"d";"1";"4";"3";"b";"l";"u";"e"}*1,""))
=TEXTJOIN("",TRUE,{"";"";"";"";"";"";1;4;3;"";"";"";""})
=143
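For comparison, the three tasks (split into characters, blank out the non-digits, join what remains) look like this in a short Python sketch. It is an illustration only; the function name is hypothetical and only ASCII digits are treated as numeric.

```python
DIGITS = "0123456789"

def extract_digits(text: str) -> str:
    # MID(text, ROW(INDIRECT("1:"&LEN(text))), 1): one element per character
    chars = list(text)
    # IFERROR(char*1, ""): non-digits become empty strings
    kept = [ch if ch in DIGITS else "" for ch in chars]
    # TEXTJOIN("", TRUE, ...): join, ignoring the empty elements
    return "".join(kept)

print(extract_digits("arnold143blue"))                       # 143
print(extract_digits("The price of 10 tickets is USD 200"))  # 10200
```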
When you drag the formula down to the rest of the cells, here’s the result you get:
Using VBA to Extract Number from Mixed Text in Excel
The above method works well enough in extracting numbers from anywhere in a mixed text.
However, it requires one to use the TEXTJOIN function, which is not available in older Excel versions (versions before Excel 2016).
If you’re on a version of Excel that does not support TEXTJOIN, then you can, instead, use a snippet of VBA code to get the job done.
If you have never used VBA before, don’t worry. All you need to do is copy the code below into your VBA developer window and run it on your worksheet data.
Here’s the code that you will need to copy:
'Code by Steve Scott from spreadsheetplanet.com
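Function ExtractNumbers(CellRef As String)
    Dim StringLength As Integer
    Dim i As Integer
    Dim Result As String
    StringLength = Len(CellRef)
    ' Walk through the string one character at a time
    For i = 1 To StringLength
        ' Append the character if IsNumeric reports it as numeric
        If (IsNumeric(Mid(CellRef, i, 1))) Then Result = Result & Mid(CellRef, i, 1)
    Next i
    ExtractNumbers = Result
End Function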
The above code creates a user-defined function called ExtractNumbers() that you can use in your worksheet to extract numbers from mixed text in any cell.
To get to your VBA developer window, follow these steps:
- From the Developer tab, select Visual Basic.
- Once your VBA window opens, Click Insert->Module. That’s it, you can now start coding.
Type or copy-paste the above lines of code into the module window. Your code is now ready to run.
Now, whenever you want to extract numbers from a cell, simply type the name of the function, passing the cell reference as a parameter. So to extract numbers from a cell A2, you will simply need to type the function as follows in a cell:
=ExtractNumbers(A2)
Explanation of the Code
Now let us take some time to understand how this code works.
- In this code, we defined a function named ExtractNumbers, that takes the string in the cell we want to work on. We assigned the name CellRef to this string.
Function ExtractNumbers(CellRef As String)
- We created a variable named StringLength, that will hold the length of the string, CellRef.
Dim StringLength As Integer
StringLength = Len(CellRef)
- Next, we loop through each character in the string CellRef and find out if it is a number. We use the function Mid(CellRef, i, 1) to extract a character from the string at each iteration of the loop. We also use the IsNumeric() function to find out if the extracted character is a number. Each extracted numeric character is combined together into a string called Result.
For i = 1 To StringLength
If (IsNumeric(Mid(CellRef, i, 1))) Then Result = Result & Mid(CellRef, i, 1)
Next i
- This Result is then returned by the function.
ExtractNumbers = Result
Note that since the workbook now has VBA code in it, you need to save it with .xls or .xlsm extension.
You can also choose to save this to your Personal Macro Workbook, if you think you will be needing to run this code a lot. This will allow you to run the code on any Excel workbook of yours.
Well, that was a lot!
In this tutorial, we showed you how to extract numbers from mixed text in Excel.
We saw three cases where the numbers are situated in different parts of the text. We also showed you how to use VBA to get the task done quickly.
There is no inbuilt function in Excel to extract the numbers from a string in a cell (or vice versa – remove the numeric part and extract the text part from an alphanumeric string).
However, this can be done using a cocktail of Excel functions or some simple VBA code.
Let me first show you what I am talking about.
Suppose you have a data set as shown below and you want to extract the numbers from each string:
The method you choose will also depend on the version of Excel you’re using:
- For versions prior to Excel 2016, you need to use slightly longer formulas
- For Excel 2016, you can use the newly introduced TEXTJOIN function
- VBA method can be used in all the versions of Excel
Extract Numbers from String in Excel (Formula for Excel 2016)
This formula works only in versions of Excel that include the TEXTJOIN function, which was introduced with Excel 2016 for Microsoft 365 subscribers and is also part of Excel 2019 and later.
Also, this formula can extract the numbers that are at the beginning, end or middle of the text string.
Note that the TEXTJOIN formula covered in this section would give you all the numeric characters together. For example, if the text is “The price of 10 tickets is USD 200”, it will give you 10200 as the result.
Suppose you have the dataset as shown below and you want to extract the numbers from the strings in each cell:
Below is the formula that will give you numeric part from a string in Excel.
=TEXTJOIN("",TRUE,IFERROR((MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1),""))
This is an array formula, so you need to use 'Control + Shift + Enter' instead of Enter. (In Excel for Microsoft 365, which handles array formulas natively, a plain Enter is enough.)
In case there are no numbers in the text string, this formula would return a blank (empty string).
How does this formula work?
Let me break this formula and try and explain how it works:
- ROW(INDIRECT("1:"&LEN(A2))) – this part of the formula gives a series of numbers starting from one. The LEN function returns the total number of characters in the string. In the case of "The cost is USD 100", it returns 19, so this part becomes ROW(INDIRECT("1:19")). The ROW function then returns the series {1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19}
- (MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1) – This part of the formula returns an array of #VALUE! errors and numbers based on the string. Every non-digit character in the string (spaces included) becomes a #VALUE! error and every digit stays as-is. This happens because each character returned by MID is multiplied by 1.
- IFERROR((MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1),"") – The IFERROR function replaces all the #VALUE! errors with empty strings, so only the digits remain. The output of this part looks like this: {"";"";"";"";"";"";"";"";"";"";"";"";"";"";"";"";1;0;0}
- =TEXTJOIN("",TRUE,IFERROR((MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1),"")) – The TEXTJOIN function now simply combines the elements that remain (the digits only) and ignores the empty strings.
Pro Tip: If you want to check the output of a part of the formula, select the cell, press F2 to get into the edit mode, select the part of the formula for which you want the output and press F9. You will instantly see the result. And then remember to either press Control + Z or hit the Escape key. DO NOT hit the enter key.
You can also use the same logic to extract the text part from an alphanumeric string. Below is the formula that would get the text part from the string:
=TEXTJOIN("",TRUE,IF(ISERROR(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)*1),MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1),""))
The small change in this formula is that the IF function checks whether each element of the array obtained from MID (after multiplying by 1) is an error. If it is an error, the original character is kept; otherwise it is replaced with an empty string.
Then TEXTJOIN is used to combine all the text characters.
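The inverse operation is equally short in a Python sketch (an illustration only, with a hypothetical function name). Note that spaces are not digits, so they are kept, just as the Excel formula keeps every character that fails the multiplication by 1.

```python
DIGITS = "0123456789"

def extract_text(text: str) -> str:
    """Keep every character that is not a digit (spaces and punctuation included)."""
    return "".join(ch for ch in text if ch not in DIGITS)

print(repr(extract_text("The cost is USD 100")))  # 'The cost is USD '
print(extract_text("arnold143blue"))              # arnoldblue
```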
Caution: While this formula works great, it uses a volatile function (the INDIRECT function). This means that in case you use this with a huge dataset, it may take some time to give you the results. It’s best to create a backup before you use this formula in Excel.
Extract Numbers from String in Excel (for Excel 2013/2010/2007)
If you have Excel 2013, 2010, or 2007, you cannot use the TEXTJOIN formula, so you will have to use a more complicated formula to get this done.
Suppose you have a dataset as shown below and you want to extract all the numbers in the string in each cell.
The below formula will get this done:
=IF(SUM(LEN(A2)-LEN(SUBSTITUTE(A2, {"0","1","2","3","4","5","6","7","8","9"}, "")))>0, SUMPRODUCT(MID(0&A2, LARGE(INDEX(ISNUMBER(--MID(A2,ROW(INDIRECT("$1:$"&LEN(A2))),1))* ROW(INDIRECT("$1:$"&LEN(A2))),0), ROW(INDIRECT("$1:$"&LEN(A2))))+1,1) * 10^ROW(INDIRECT("$1:$"&LEN(A2)))/10),"")
In case there is no number in the text string, this formula would return blank (empty string).
Although this is an array formula, you don’t need to use ‘Control-Shift-Enter’ to use this. A simple enter works for this formula.
Credit to this formula goes to the amazing Mr. Excel forum.
Again, this formula will extract all the numbers in the string no matter the position. For example, if the text is “The price of 10 tickets is USD 200”, it will give you 10200 as the result.
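The formula differs from the TEXTJOIN version in that it builds a real number by weighting each digit with a power of ten. The Python sketch below illustrates that arithmetic idea only; it is a simplification rather than a step-by-step translation of the formula, and the function name `digits_as_number` is hypothetical.

```python
def digits_as_number(text: str):
    """Combine all digits found in text into one number via place values."""
    digits = [int(ch) for ch in text if ch in "0123456789"]
    if not digits:
        return ""  # the formula returns an empty string when no digit exists
    total = 0
    # The last digit gets weight 10^0, the one before it 10^1, and so on
    for power, digit in enumerate(reversed(digits)):
        total += digit * 10 ** power
    return total


print(digits_as_number("The price of 10 tickets is USD 200"))  # prints 10200
```

Because the result is a number, leading zeros are lost: in this sketch, "A007" gives 7.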
Caution: While this formula works great, it uses a volatile function (the INDIRECT function). This means that in case you use this with a huge dataset, it may take some time to give you the results. It’s best to create a backup before you use this formula in Excel.
Separate Text and Numbers in Excel Using VBA
If separating text and numbers (or extracting numbers from the text) is something you have to do often, you can also use the VBA method.
All you need to do is use a simple VBA code to create a custom User Defined Function (UDF) in Excel, and then instead of using long and complicated formulas, use that VBA formula.
Let me show you how to create two formulas in VBA – one to extract numbers and one to extract text from a string.
Extract Numbers from String in Excel (using VBA)
In this part, I will show you how to create the custom function to get only the numeric part from a string.
Below is the VBA code we will use to create this custom function:
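Function GetNumeric(CellRef As String)
    ' Returns only the numeric characters of CellRef, in their original order
    Dim StringLength As Integer
    Dim i As Integer
    Dim Result As String
    StringLength = Len(CellRef)
    For i = 1 To StringLength
        ' Keep the character only if IsNumeric reports it as a number
        If IsNumeric(Mid(CellRef, i, 1)) Then Result = Result & Mid(CellRef, i, 1)
    Next i
    GetNumeric = Result
End Function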
Here are the steps to create this function and then use it in the worksheet:
Now, you will be able to use the GetNumeric function in the worksheet. Since we have done all the heavy lifting in the code itself, all you need to do is use the formula =GetNumeric(A2).
This will instantly give you only the numeric part of the string.
Note that since the workbook now has VBA code in it, you need to save it with .xls or .xlsm extension.
In case you have to use this formula often, you can also save this to your Personal Macro Workbook. This will allow you to use this custom formula in any of the Excel workbooks that you work with.
Extract Text from a String in Excel (using VBA)
In this part, I will show you how to create the custom function to get only the text part from a string.
Below is the VBA code we will use to create this custom function:
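Function GetText(CellRef As String)
    ' Returns only the non-numeric characters of CellRef, in their original order
    Dim StringLength As Integer
    Dim i As Integer
    Dim Result As String
    StringLength = Len(CellRef)
    For i = 1 To StringLength
        ' Keep the character only if IsNumeric does not report it as a number
        If Not (IsNumeric(Mid(CellRef, i, 1))) Then Result = Result & Mid(CellRef, i, 1)
    Next i
    GetText = Result
End Function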
Here are the steps to create this function and then use it in the worksheet:
Now, you will be able to use the GetText function in the worksheet. Since we have done all the heavy lifting in the code itself, all you need to do is use the formula =GetText(A2).
This will instantly give you only the text part of the string.
Note that since the workbook now has VBA code in it, you need to save it with .xls or .xlsm extension.
In case you have to use this formula often, you can also save this to your Personal Macro Workbook. This will allow you to use this custom formula in any of the Excel workbooks that you work with.
Splitting single-cell values into multiple cells, and collating multiple cell values into one, is a part of data manipulation. With the help of the Excel text functions "LEFT," "MID," and "RIGHT," we can extract part of a selected text or string value. To make the formula dynamic, we can use supporting functions like "FIND" and "LEN." However, extracting only the numbers from alphanumeric values requires an advanced level of formula knowledge.
This article will show you the three ways to extract numbers from a string in Excel.
- #1 – Extract Number from the String at the End of the String
- #2 – Extract Numbers from Right Side but Without Special Characters
- #3 – Extract Numbers from any Position of the String
Below we have explained the different ways of extracting the numbers from strings in Excel. Read the whole article to learn this technique.
#1 – Extract the Number from the End of the String in Excel
When we get the data, it follows a certain pattern, and having all the numbers at the end of the string is one of the patterns.
- For example, the city with its zip code below is a sample of the same.
- We have the city name and zip code in the above example. In this case, we know we have to extract the zip code from the right-hand side of the string. But, we do not know exactly how many digits we need from the right-hand side of the string.
In every row, an underscore (_) character comes right before the numerical value. So first, we need to identify the position of the underscore, which we can do with the FIND function in Excel.
- What text do we need to find in the find_text argument? In this example, we need to find the position of the underscore, so enter the underscore in double quotes.
- The within_text argument is the text in which we need to search, so select the cell reference.
- The last argument (start_num) is optional, so leave it out for now.
- So, we have the position of the underscore character for each cell. Next, we need to identify how many characters we have in the entire text. So, we apply the LEN function in Excel to get the total length of the text value.
- Now, we have the total number of characters and the position of the underscore before the numerical value. To supply the number of characters needed for the RIGHT function, we subtract the underscore position from the total number of characters.
- Now, apply the RIGHT function in cell E2.
- So, like this, we can get the numbers from the right-hand side when a common character comes before the number in the string value. Instead of having so many helper columns, we can apply the formula in a single cell.
=RIGHT(A2,LEN(A2)-FIND("_",A2))
It will eliminate all the supporting columns and reduce the time drastically.
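To make the arithmetic behind the single-cell formula concrete, here is a small Python sketch of the same three steps (find the underscore, measure the length, take the right-hand part). The function name and the sample value "Bangalore_560001" are assumptions for illustration, since the original sample data is not reproduced here.

```python
def number_after_underscore(text: str) -> str:
    """Return everything to the right of the first underscore."""
    index = text.find("_")
    if index < 0:
        # Excel's FIND would return a #VALUE! error in this case
        raise ValueError("no underscore in text")
    underscore_pos = index + 1           # FIND is 1-based, str.find is 0-based
    count = len(text) - underscore_pos   # LEN(A2) - FIND("_", A2)
    return text[len(text) - count:]      # RIGHT(A2, count)


print(number_after_underscore("Bangalore_560001"))  # prints 560001
```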
#2 – Extract Numbers From Right Side but Without Special Characters
Assume we have the same data, but this time we do not have any special character before the numerical value.
We have found a special character position in the previous example, but we do not have that luxury here. So, the formula below will find the “Numerical Position.”
Please do not turn off your computer by looking at the formula. We will decode this for you.
To the SEARCH function in Excel, we have supplied all ten digits as an array, so the formula looks for the position of each digit in the text. Appending "0123456789" to the cell value guarantees that every digit is found somewhere, so the resulting array contains only positions and no errors. Then, the MIN function in Excel returns the smallest of those positions, which is where the first digit sits, so the formula reads as below.
=MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A2&"0123456789"))
So, now we have the "Numerical Position." Next, let us find the total number of characters in the cell with the LEN function.
LEN returns the total number of characters in the supplied cell value. The total length minus the position of the first digit, plus 1, gives the number of characters required from the right side, so apply the formula to get the number of characters.
Now, we must apply the RIGHT function in Excel to get only the numerical part from the string.
Let us combine the formula in a single cell to avoid multiple helper columns.
{=RIGHT(A2,LEN(A2)-MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A2&"0123456789"))+1)}
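The padding trick is easier to see when spelled out step by step. The Python sketch below mirrors the formula under the assumption that the number sits at the end of the text; the function names and the sample value "Bangalore560001" are hypothetical.

```python
DIGITS = "0123456789"


def first_digit_position(text: str) -> int:
    """1-based position of the first digit, like MIN(SEARCH({0,...,9}, A2&"0123456789"))."""
    padded = text + DIGITS  # padding guarantees that every digit is found
    return min(padded.find(digit) + 1 for digit in DIGITS)


def trailing_number(text: str) -> str:
    """Everything from the first digit to the end of the text."""
    start = first_digit_position(text)
    count = len(text) - start + 1    # LEN(A2) - position + 1
    return text[len(text) - count:]  # RIGHT(A2, count)


print(first_digit_position("Bangalore560001"))  # prints 10
print(trailing_number("Bangalore560001"))       # prints 560001
```

When the text has no digit at all, the padded search returns a position one past the end of the text, so the count becomes 0 and the result is an empty string.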
#3 – Extract Number From Any Position in Excel
So far, we have extracted numbers from the right side of the string. But the number is not always at the end, so now we will see how to extract numbers from any position in a string in Excel.
For this, we need to employ various functions of Excel. Below is the formula to extract the numbers from any string position.
=IF(SUM(LEN(A2)-LEN(SUBSTITUTE(A2, {"0","1","2","3","4","5","6","7","8","9"}, "")))>0, SUMPRODUCT(MID(0&A2, LARGE(INDEX(ISNUMBER(--MID(A2,ROW(INDIRECT("$1:$"&LEN(A2))),1))* ROW(INDIRECT("$1:$"&LEN(A2))),0), ROW(INDIRECT("$1:$"&LEN(A2))))+1,1) * 10^ROW(INDIRECT("$1:$"&LEN(A2)))/10),"")
Extracting numbers from text in excel [Case study]
Last updated on June 19, 2012
Chandoo
Often we deal with data where numbers are buried inside text and we need to extract them. This morning I had such a task. As you know, we recently ran a survey asking how much salary you make. We have had 1800 responses to it so far. I took the data to Excel to analyze it. And surprise! The numbers are a mess. Here is a sample of the data.
Now, how do I extract the salary amounts from this without typing the values?
My first thought is to write a user defined function to extract the number from text. But I usually shy away from VBA. So I wanted to see if there is a formula based approach to extract the number from text.
Using formulas to extract number from text
To extract number from a text, we need to know 2 things:
- Starting position of the number in text
- Length of the number
For example, in text US $ 31330.00 the number starts at 6th letter and has a length of 8.
So, if we can write formulas to get 1 & 2, then we can combine them in MID formula to extract the number from text!
Finding the starting position of number in text
To find the starting position, we need to find the first character which is a number (0 to 9). In other words, if we can find the positions of 0 to 9 inside the given text, then the minimum of all such positions would be starting position.
Sounds complicated?!? Well, in that case look at the formula and then you will understand why this works.
Assuming the text is in A1 and the range lstNumbers contains 0 to 9, below formula finds starting position
{=MIN(IFERROR(FIND(lstNumbers,A1),""))}
You need to array enter it (CTRL+SHIFT+Enter)
How this formula works?
FIND(lstNumbers, A1) portion: This part finds where each of the numbers 0 to 9 occur in the text in A1. If a match is found, the position is returned. Else we get an error. For US $ 31330.00 the values would be,
{10;7;#VALUE!;6;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!}
Meaning, 0 occurs at 10th position, 1 occurs at 7th position, 3 occurs at 6th position and everything else (2,4,5,6,7,8,9) do not occur in the number.
IFERROR(…,"") portion: Then, we replace errors with empty strings so that MIN could work its magic.
At this stage, the result would be, {10;7;"";6;"";"";"";"";"";""}
Related: IFERROR Formula – syntax & examples
{=MIN(…)} portion: This would find the minimum of {10;7;"";6;"";"";"";"";"";""} which is 6. The starting position of number inside text.
Because we are finding multiple items, we need to array enter the formula to get correct result.
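If the intermediate arrays are hard to picture, the following Python sketch reproduces them for the sample text. It is only an illustration of the idea, with `None` standing in for Excel's #VALUE! error; the function names are made up for this example.

```python
def digit_positions(text: str) -> list:
    """1-based position of each digit 0..9 in text, or None when it is absent."""
    positions = []
    for digit in "0123456789":
        index = text.find(digit)  # like FIND(lstNumbers, A1), one digit at a time
        positions.append(index + 1 if index >= 0 else None)
    return positions


def start_of_number(text: str):
    """Smallest position that was found, like MIN(IFERROR(..., ""))."""
    found = [p for p in digit_positions(text) if p is not None]
    return min(found) if found else None


print(digit_positions("US $ 31330.00"))
# prints [10, 7, None, 6, None, None, None, None, None, None]
print(start_of_number("US $ 31330.00"))  # prints 6
```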
Finding the length of number
Once we find starting point, next we need to know the length of the number. There are many ways to do this. Depending on the variety in your input data, you can choose a technique that works best.
Approach 1 – counting number of digits in text
My first approach is to count number of digits in the text and use it as length. For this, we can break the text in to individual characters and then see if each of them is a number or not.
Assuming the text is in A1, the number of digits in it are,
=SUMPRODUCT(--ISNUMBER(MID(A1,ROW($A$1:$A$200),1)+0))
MID(A1,ROW($A$1:$A$200),1) + 0 portion: This breaks the text in A1 in to individual characters (assumes the max length is 200) and then adds 0 to them.
At this stage, you have 200 values some of them numbers, others errors.
ISNUMBER(…) portion: This checks all the 200 values for numbers. After this, we will have 200 true or false values.
--ISNUMBER(…) portion: This converts the true, false values to 1s and 0s. (by double negating Excel will convert boolean values to number equivalents).
SUMPRODUCT(…) portion: This finally sums up all 1s thus giving us the number of digits in the text.
Does it work?
While this approach works well for some numbers, it fails in other cases. For example, a text like US $ 31330.00 has a number portion with 8 characters (31330.00), whereas our formula would say the length is 7 (because the decimal point . is not a number and hence ISNUMBER() would give false for that).
So I had to move on to next approach.
Approach 2 – counting number of digits, commas & decimal points in text
The next approach is to count not only numbers, but also commas & decimal points in the text. For this, first I placed all the digits (0 to 9) and comma & decimal point in a range called as lstDigits.
Below formula counts how many of lstDigits are in text in A1.
=SUMPRODUCT(COUNTIF(lstDigits,MID(A1,ROW($A$1:$A$200),1)))
COUNTIF(lstDigits, MID(…)) portion: This checks how many times each of the 200 characters appear in lstDigits.
This would be an array of counts. For example {0;0;0;0;0;1;1;1;1;1;1;1;1;…} for US $ 31330.00, indicating that first 5 are not in lstDigits and then we have 8 in lstDigits.
SUMPRODUCT(…) portion: just sums all the numbers, hence we get length as 8.
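The difference between the two counting approaches is easy to check on the sample text. Here is a minimal Python sketch, assuming the same set of allowed characters as lstDigits (digits, comma and decimal point); the function names are hypothetical.

```python
SAMPLE = "US $ 31330.00"


def count_digits(text: str) -> int:
    """Approach 1: count digit characters only."""
    return sum(1 for ch in text if ch in "0123456789")


def count_number_chars(text: str, allowed: str = "0123456789,.") -> int:
    """Approach 2: count every character that appears in the allowed list."""
    return sum(1 for ch in text if ch in allowed)


print(count_digits(SAMPLE))        # prints 7, the decimal point is missed
print(count_number_chars(SAMPLE))  # prints 8, the decimal point is counted
```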
Related: SUMPRODUCT Formula – examples & explanation
Extracting numbers from text
Once we have starting position of number & its length, we can combine them in a MID formula to extract the number. Here is the result for our sample data set.
As you can see, this method works well, but fails in some cases like,
- European number formats (, for decimal point and . for thousands)
- Text with multiple numbers
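The start-plus-length method, and the way it breaks when a text holds several numbers, can be sketched in a few lines of Python. This is an illustration under the same assumptions as the formulas (digits, commas and decimal points count toward the length); the function name and the second sample text are made up.

```python
NUMBER_CHARS = "0123456789,."


def extract_number(text: str) -> str:
    """Take `length` characters starting at the first digit, like MID."""
    positions = [text.find(digit) for digit in "0123456789" if digit in text]
    if not positions:
        return ""
    start = min(positions)  # 0-based index of the first digit
    length = sum(1 for ch in text if ch in NUMBER_CHARS)
    return text[start:start + length]


print(extract_number("US $ 31330.00"))           # prints 31330.00
print(extract_number("10 tickets for USD 200"))  # prints "10 ti", not a number
```

In the second call the length counts the digits of both numbers, so the extracted slice runs past the first number and into the following text.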
Fortunately, in my data set, we had only a few incidents like these. So I have decided to manually adjust them rather than work out an even more complicated formula.
Using Macros to extract numbers from text
As you can guess, we can use a simple macro (or UDF) to extract numbers from a given text. We will learn how to do this next week.
Download Example Workbook
Click here to download example workbook with all these formulas. Examine the formulas to understand how you can extract numbers from text in Excel.
Often I deal with data like this. I use a mix of techniques. Apart from the one mentioned above I also use,
- getNumber() UDF to extract numbers from text (more on this next week)
- Use SUBSTITUTE to clear formatting (replace dots with empty strings and commas with dots to convert from European format to standard format)
- Use VALUE to extract the number (works when number is shown as text)
- Use +0 to force convert numbers from text (works when number is shown as text)
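The SUBSTITUTE clean-up for European formats is a two-step replacement followed by a conversion. A rough Python sketch of that idea is shown below; the function name and the sample value "1.234,56" are assumptions for illustration.

```python
def european_to_number(text: str) -> float:
    """Convert a European-formatted number such as '1.234,56' to a float."""
    without_thousands = text.replace(".", "")       # drop thousands separators
    standard = without_thousands.replace(",", ".")  # comma becomes decimal point
    return float(standard)                          # like VALUE() or +0 in Excel


print(european_to_number("1.234,56"))  # prints 1234.56
```

The order matters: the dots must be removed before the comma is turned into a dot, otherwise the new decimal point would be removed as well.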
What about you? How do you extract numbers from text? What are your favorite techniques? Please share using comments.
Tips on cleaning data using Excel
If you use Excel to clean data, go thru these articles to learn some powerful techniques.
- Clean up dates & convert text to dates
- Remove Duplicates using Excel using formulas or using Excel features
- Extract phone numbers from data
- Quickly fill blank cells with data
65 Responses to “Extracting numbers from text in excel [Case study]”
-
Gregor Erbach says:
I have learned one important lesson in my career: don’t use Excel as a word processor!
Excel is very ill-suited to handling tasks like pattern matching in text (regular expressions).
For example, one could use a Perl script (from the CMD command line) to extract numbers from the text, something like
perl -n -e "m/([\d,.]*)/; print "$1\n" IN > OUT
The point I am trying to make is, if text processing is a substantial part of your tasks, it is worth the effort learning a tool that is more suited to the task.
I agree with what you're saying about suitability of tools to task,
but seriously perl -n -e "m/([\d,.]*)/; print "$1\n" IN > OUT
Doesn't just roll off your tongue
and very few PCs would even have Perl installed
-
Interesting point. I would have used VBA or some variation of that to clean data. But as Hui says Perl is not something many (including me) have on our computers. Plus I suck at regular expressions. I have tried long and hard to learn them, but I guess my mind is not wired to understand how they work 🙁
-
Jim says:
The VBA engine for regular expressions works fine in Excel and Access. I agree if you are doing a substantial amount of text cleaning then doing so exclusively in Excel is probably not the best course, but using regular expressions in Perl or any other flavor doesn’t really provide any advantage over using regex in Excel. At the end of the day, regex is regex (with the exception of minor syntactic differences).
Chandoo, a lot of people are intimidated by regex just like people are intimidated by formulas or macros. Once you learn a handful of rules and concepts, you can really do some damage with regex, and not just for data cleaning. You can validate input, parse, transform (e.g. firstname lastname —> lastname, firstname), etc. Once you pick it up you’ll see many opportunities where it’s useful and sometimes superior to another approach (and sometimes not). My brain is not wired to understand multiple nested formulas or complicated array formulas so I guess regex is just easier to me. I would add that the regular expression approach not only cleans this data, but is flexible enough to account for other variations…try it on the other records you have and see how well it performs. Either way, it’s great to know there are multiple approaches to solving the problem!
-
-
Bhavani Seetal Lal says:
How to sign inn into chandoo.org???
-
Vijay Kumar says:
Dear Chandoo…
Awesome, you are such a genius, how can you think like "what the excel will think"? Such ideas implementation…Very great. But lastly, I just wanted to know that in your formula =SUMPRODUCT(COUNTIF(lstDigits,MID(B4,ROW($A$1:$A$200),1))), why can't we use 1 instead of ROW($A$1:$A$200)? Obviously if we input this formula somewhere in Excel, it results in 1. So what is the logic of using the ROW function instead of a simple 1?
Please explain, i am eager to learn the logic of inputting or building a formulas. Can you suggest me the source like web url or books to build better formulas in our excel work
-
Luke M says:
Within SUMPRODUCT, that bit of formula is actually producing an array of numbers from 1 to 200, eg
{1,2,3,4,…200}
To see this in the formula, highlight that portion within the SUMPRODUCT and hit F9.
-
-
In the first example above, I would store lstNumbers as a Named Formula:
lstNumbers ={0;1;2;3;4;5;6;7;8;9}
-
I do not know why you would not want to use a UDF — Here is the code for one I wrote that extracts Numbers, Letters, Commas and spaces to clean up data — amend as you wish for ANY ASCII characters
Function NumbersAndLettersOnly(instring) As String
    Dim StringLength As Integer 'to hold string length
    Dim i As Integer 'counter for loop
    Dim AsciiVal 'to hold working character
    Dim WorkingString As String 'to build output string
    instring = Trim(instring) 'Drop leading & trailing spaces
    StringLength = Len(instring) 'Count number of characters in string
    For i = 1 To StringLength 'Loop thru each character in the string
        AsciiVal = Asc(Mid(instring, i, 1))
        If AsciiVal >= 48 And AsciiVal <= 57 Then 'Numbers 0-9
            WorkingString = WorkingString & Chr(AsciiVal)
        ElseIf AsciiVal >= 65 And AsciiVal <= 90 Then 'A-Z
            WorkingString = WorkingString & Chr(AsciiVal)
        ElseIf AsciiVal >= 97 And AsciiVal <= 122 Then 'a-z
            WorkingString = WorkingString & Chr(AsciiVal)
        ElseIf AsciiVal = 46 Then '.
            WorkingString = WorkingString & Chr(AsciiVal)
        ElseIf AsciiVal = 32 Then '{space}
            WorkingString = WorkingString & Chr(AsciiVal)
        End If
    Next i
    NumbersAndLettersOnly = WorkingString 'Return output to function
End Function
@Cliff,
In case you might be interested, here is a shorter UDF which does the same thing as the one you posted…
Function AlphaNumerics(ByVal InString) As String
    Dim X As Integer
    For X = 1 To Len(InString)
        If Mid(InString, X, 1) Like "[!A-Za-z0-9. ]" Then Mid(InString, X, 1) = Chr$(1)
    Next
    AlphaNumerics = Replace(InString, Chr$(1), "")
End Function
Kevin says:
I used this one, because it was the smallest but most readable (to me).
The quotes didn’t translate correctly into VBA when I cut and pasted, so I just deleted them and typed them in again. Also, this function does the opposite: it finds the search characters in the 1st set of quotes, and then substitutes a null character. In my case, I wanted the opposite, so I just added a NOT in front of Mid. Thanks Rick. 🙂
-
-
-
Hi Purna,
I had a similar request earlier this week. Here's the 'quick & dirty' solution I used. Keep in mind that, if there are a lot of different characters and text you need to remove from your text, this can get a little cumbersome. But for removing a few different text strings and characters from the same cell, this is pretty simple to understand.
Using your example, to remove the space character, INR, and Rs, I would use this formula…
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B4," ",""),"INR",""),"Rs","")
You start with =SUBSTITUTE(B4," ","") and then wrap it in another SUBSTITUTE for each text string you want to remove.
-
Elias says:
One more option that works with all Excel versions.
=1*
MID(A1,
MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),
FIND(" ",A1&" ",MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")))
)
Confirm with Ctrl+Shift+Enter
Regards
-
Elias says:
Other more with different approach.
=1*
MID(A1,
MATCH(1,1*ISNUMBER(1*MID(A1,ROW($1:$255),1)),0),
MATCH(10,1*MID(A1,ROW($1:$255),1))
)
Confirm with Ctrl+Shift+Enter
Regards
-
Isaac says:
You just saved me from giving up haha. I tried many formulas that didn’t work, this one did it!!! Thank You very much
-
@Isaac
Never give up
The Chandoo.org Forums are a great source for asking "How do I …" questions and getting a result real quick
http://chandoo.org/forums/
-
-
-
Okay, one more method. The following was posted some time ago by Lars-Åke Aspelin in some forum or newsgroup…
=MID(SUMPRODUCT(--MID("01"&A1,SMALL((ROW($1:$300)-1)*ISNUMBER(-MID("01"&A1,ROW($1:$300),1)),ROW($1:$300))+1,1),10^(300-ROW($1:$300))),2,300)
This is an array formula and has to be confirmed with CTRL+SHIFT+ENTER rather than just ENTER.
It has the following (known) limitations:
- The input string in cell A1 must be shorter than 300 characters
- There must be at most 14 digits in the input string.
(Following digits will be shown as zeroes.)
Maybe of no practical use, but it will also handle the following two cases correctly:
- a "0" as the first digit in the input will be shown correctly in the output
- an input without any digits at all will give the empty string as output (rather than 0).
Jim says:
I would probably use regular expressions in this case. The functions are:
************************************************
Function RegExReplace(ReplaceIn, _
    ReplaceWhat As String, ReplaceWith As String, Optional IgnoreCase As Boolean = False)
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    RE.IgnoreCase = IgnoreCase
    RE.Pattern = ReplaceWhat
    RE.Global = True
    RegExReplace = RE.Replace(ReplaceIn, ReplaceWith)
End Function

Function RegExFind(FindIn, FindWhat As String, _
    Optional IgnoreCase As Boolean = False)
    Dim i As Long
    Dim matchCount As Integer
    Dim RE As Object, allMatches As Object, aMatch As Object
    Set RE = CreateObject("vbscript.regexp")
    RE.Pattern = FindWhat
    RE.IgnoreCase = IgnoreCase
    RE.Global = True
    Set allMatches = RE.Execute(FindIn)
    matchCount = allMatches.Count
    If matchCount >= 1 Then
        ReDim rslt(0 To allMatches.Count - 1)
        For i = 0 To allMatches.Count - 1
            rslt(i) = allMatches(i).Value
        Next i
        RegExFind = rslt
    Else
        RegExFind = ""
    End If
End Function
************************************************
and the usage is:
=RegExFind(B4,"\d{2,7}(.)?\d{2,7}(.\d{2}|,0{1,2})?")
The expression \d{2,7}(.)?\d{2,7}(.\d{2}|,0{1,2})? is a little twisted, but these data are pretty dirty, too! Just using this expression: \d+ would match all but two of the sample records.
Cheers,
-Jim
Godsbod says:
The obvious solution would have been to make a more structured questionnaire, with a separate value field for currency and amount…
but the challenge now arising is almost a worthy by-product…
-
-
Elias says:
@ Juan,
That formula doesn’t work in this case.
Regards
-
-
Carl says:
I think the bigger problem is the one created by separating the numbers from the letters. The goal was to figure out what people are making in their excel jobs, and now we have numbers but no reference to the currency, so you can’t average these numbers to figure out how much the average excel-wise employee is making. We’d have to go back and determine the currency and calculate the exchange rate. I think @Godsbod had it right. The more important lesson is to think through the questionnaire in order to get more manageable results, which is what we really want.
-
-
r says:
@Elias
your formulas are very beautiful … But I do not understand why to use a matrix of constants!
use ROW(1:10)-1 and 1234567890 … better!
Also your formulas do not work in cases similar to this ABCD 1234 EFGH … You need to make small changes … Meanwhile I propose this is shorter and also works in these cases:
=LEFT(REPLACE(A1,1,MIN(FIND(ROW(1:10)-1,A1&57321^2))-1,),FIND(" ",REPLACE(A1,1,MIN(FIND(ROW(1:10)-1,A1&57321^2))-1,)&" ")-1)
ah … 57321 is a pandigital number 🙂
regards
r-
Elias says:
@r nice to see you around.
Other than the number of characters I don’t see any advantages using the row option over the array of constants. However, using constants avoid the use of array enter and the formulas don’t get mess if you insert or delete rows.
Regards
-
-
r says:
@elias
you’re right but I can not look at the formula
is only an aesthetic factor … impossible to say "beautiful" by a matrix of constants
regards
r
-
r says:
@elias
i have changed your formula … so it work with ABCD 1234 EFGH:
=--MID(LEFT(A1,FIND(" ",A1&" ",MIN(FIND(ROW(1:10)-1,A1&57321^2)))-1),MIN(FIND(ROW(1:10)-1,A1&57321^2)),20)
what do you think?
regards
r
-
Elias says:
@r
I like that one, but it doesn’t work with cases like this ABCD12345EFGH. So, what about this one?
=--LOOKUP(9.9E+307,--LEFT(MID(A1,MIN(FIND(ROW(1:10)-1,A1&57321^2)),255),ROW(1:255)))
Regards
-
Jim says:
My RegEx solution above handles both ABCD 1234 EFGH and ABCD12345EFGH.
Cheers,
-Jim
-
-
-
r says:
oh @elias … fantastic! i like it!
and, why not so:
=-LOOKUP(0,-LEFT(MID(A1,MIN(FIND(ROW(1:10)-1,A1&57321^2)),255),ROW(1:255)))
🙂
r
-
r says:
ufff … so:
=-LOOKUP(0,-LEFT(MID(A1,MIN(FIND(ROW(1:10)-1,A1&57321^2)),255),ROW(1:255)))
Elias says:
@r
Yea!!! I like the use of negative sign in 2 different places.
Regards
-
bashful says:
Great! I was able to use this formula to extract the first number from HGVS (human genome variation society) nomenclature:
c.247_248delCG 247
p.Glu789_Arg790del 789
The c. or p. are sometimes omitted and the numbers can be of variable length (1-12 digits).
I would like to understand how this formula works. I basically know lookup, left, mid, right and find but this is over my head!
Thank you!
-
-
-
joel says:
Mr. Chandoo,
please ask you a help
I want to have a code for Excel 2010, to list on a sheet all macros from a workbook excel
please, help me in this
thank you very much
-
Raj says:
I want to separate in each column like 25 PF 105 E01 XXXX , from below examples , please help or write srinivasmr@rediffmail.com
25PF105E01XXXX
100SP105E02XXXX
-
Jim says:
Hi Raj,
Use the RegExReplace function I posted above and then use this:
=RegExReplace(A1,"(^\d{2,3})(\w{2})(\d{3})(\w{3}(\w{4}))","$1 $2 $3 $5")
where the text you want to parse is in cell A1.
And the result is:
25 PF 105 XXXX
100 SP 105 XXXX
It makes the following assumptions:
- The first part is at least 2 but no more than 3 digits
- The second part is 2 characters
- The third part consists of three digits
- The fourth part consists of 4 alpha-numeric characters
Cheers,
-Jim
-
rao says:
Hi
Could you tell me how to learn regex? Please suggest websites where I can learn it.
my mail id: raoexcel@gmail.com
-
-
Rahim Zulfiqar Ali says:
=IF(ISERROR(MID(B11,FIND("/",B11,1)-2,10)),"",(MID(B11,FIND("/",B11,1)-2,10)))
-
Santosh says:
Hi,
I have an issue where I have strings of numbers and letters that are separated by commas. (Q123,125-129,QA123,127-129).
I would like to expand the numbers joined by a hyphen so that it becomes
(Q123,125,126,127,128,129,QA123,127,128,129)
Can this be done using just formulas?
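Whether a formula-only solution is practical is left open here. As a point of comparison, the expansion itself can be sketched in a few lines of Python. This is a minimal sketch with a made-up function name, `expand_ranges`, and it assumes that only items of the form number-number are ranges and that everything else is copied through unchanged.

```python
import re


def expand_ranges(text: str) -> str:
    """Expand items such as 125-129 into 125,126,127,128,129."""
    expanded = []
    for item in text.split(","):
        item = item.strip()
        match = re.fullmatch(r"(\d+)-(\d+)", item)
        if match and int(match.group(1)) <= int(match.group(2)):
            low, high = int(match.group(1)), int(match.group(2))
            expanded.extend(str(n) for n in range(low, high + 1))
        else:
            # Not an ascending numeric range: keep the item as it is.
            expanded.append(item)
    return ",".join(expanded)


print(expand_ranges("Q123,125-129,QA123,127-129"))
```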
-
Vijay Verma says:
Check this out…This seems to be the right thing to do..
http://www.excelforum.com/excel-formulas-and-functions/653043-extracting-numbers-from-alphanumeric-strings.html
This says the following:
Try this
A1: abc123def456ghi789
First, create a Named Formula
Names in Workbook: Seq
Refers to: =ROW(INDEX($1:$65536,1,1):INDEX($1:$65536,255,1))
This ARRAY FORMULA removes ALL non-numerics from a string
B1: =SUM(IF(ISNUMBER(1/(MID(A1,seq,1)+1)),MID(A1,seq,1)*10^MMULT(-(seq<TRANSPOSE(seq)),-ISNUMBER(1/(MID(A1,seq,1)+1)))))
In the example, the formula returns: 123456789
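What the array formula computes can be illustrated with a short Python sketch. It is not a translation of the MMULT logic, only the same end result under the assumption that every ASCII digit is kept in order and everything else is dropped; `digits_only` is a hypothetical name.

```python
def digits_only(text: str) -> int:
    """Keep the digits of text, in order, and read them as one number."""
    digits = "".join(ch for ch in text if ch in "0123456789")
    # The worksheet formula is a SUM, so text without digits yields 0.
    return int(digits) if digits else 0


print(digits_only("abc123def456ghi789"))
```

Excel stores numbers with 15 significant digits, so for strings containing more digits than that the worksheet result would lose precision, while the Python integer would not.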
-
antonio says:
please, help
I need code for Excel 2010 to list all the macros of an Excel workbook on one sheet: the name of the macro and the sheet of the macro.
Many thanks
-
@Antonio
Try the following:
Sub ListMacros()
    ' Code thanks to Bob Philips
    ' Lists the name of every procedure in this workbook on a new worksheet.
    ' Access to VBProject typically requires the Trust Center setting
    ' "Trust access to the VBA project object model" to be enabled.
    Const vbext_pk_Proc = 0
    Dim VBComp As Object
    Dim VBCodeMod As Object
    Dim oListsheet As Object
    Dim StartLine As Long
    Dim ProcName As String
    Dim iCount As Integer

    Application.ScreenUpdating = False
    On Error Resume Next
    Set oListsheet = ActiveWorkbook.Worksheets.Add
    iCount = 1
    oListsheet.Range("A1").Value = "Macro"

    For Each VBComp In ThisWorkbook.VBProject.VBComponents
        Set VBCodeMod = ThisWorkbook.VBProject.VBComponents(VBComp.Name).CodeModule
        With VBCodeMod
            ' Skip the declarations, then step through one procedure at a time
            StartLine = .CountOfDeclarationLines + 1
            Do Until StartLine >= .CountOfLines
                ProcName = .ProcOfLine(StartLine, vbext_pk_Proc)
                oListsheet.[a1].Offset(iCount, 0).Value = ProcName
                iCount = iCount + 1
                StartLine = StartLine + .ProcCountLines(ProcName, vbext_pk_Proc)
            Loop
        End With
        Set VBCodeMod = Nothing
    Next VBComp

    Application.ScreenUpdating = True
End Sub
or have a look at: http://msdn.microsoft.com/en-us/library/office/dd890502%28v=office.11%29.aspx
-
antonio says:
Many thanks Huis
Excel 2010 gives errors in all of these lines:
For Each VBComp In ThisWorkbook.VBProject.VBComponents
Set VBCodeMod = ThisWorkbook.VBProject.VBComponents(VBComp.Name).CodeModule
With VBCodeMod
StartLine = .CountOfDeclarationLines + 1
Do Until StartLine >= .CountOfLines
oListsheet.[a1].Offset(iCount, 0).Value = .ProcOfLine(StartLine, vbext_pk_Proc)
iCount = iCount + 1
StartLine = StartLine + .ProcCountLines(.ProcOfLine(StartLine, vbext_pk_Proc), vbext_pk_Proc)
Loop
End With
Set VBCodeMod = Nothing
Next VBComp
Greetings
-
-
antonio says:
Many thanks
I have tried but it did not give result for Excel 2010
This code is for Excel 2007 or for VB, but it does not work in Excel 2010.
Regards
Antonio
-
-
-
-
-
Beth says:
With the data I had, I was able to use your first approach with great success. In my case I did not have any commas or decimals to worry about; I was simply trying to extract a simple account number from a string of letters and numbers. So, after creating the named range for lstNumbers, I was able to use the starting position formula, the number of characters formula, and then used a MID function to get the values I needed.
Thank you, thank you, thank you! 🙂
-
-
Jimmy Luong says:
This is great. Thanks a lot
-
I encountered exactly the same situation recently, here’s my approach to solve it.
=MID(A2,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A2&"0123456789")),COUNT(0+MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)))+0
Ctrl Shift Enter
Not working for all cases though, e.g. with more than 1 set of numbers
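The limitation mentioned above is easier to see when the two parts of the formula are separated. The Python sketch below is an illustration under assumptions, not the commenter's code: `slice_by_digit_count` is a made-up name, and it returns the extracted text rather than a number so that the failing case stays visible.

```python
from typing import Optional


def slice_by_digit_count(text: str) -> Optional[str]:
    """Start at the first digit and take as many characters as there are digits."""
    positions = [i for i, ch in enumerate(text) if ch in "0123456789"]
    if not positions:
        return None
    start = positions[0]           # MIN(FIND({0,...,9}, A2&"0123456789"))
    digit_count = len(positions)   # COUNT(0+MID(A2, ROW(...), 1))
    return text[start:start + digit_count]


print(slice_by_digit_count("Invoice 12345 paid"))
print(slice_by_digit_count("ab12cd34"))
```

With a single group of digits the slice is exactly that group. With two groups, the count covers all digits in the string, so the slice runs past the first group into the letters that follow, and adding 0 to such a result in Excel would give an error.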
Here’s my post for sharing:
http://wmfexcel.wordpress.com/2014/09/06/how-excel-formula-can-save-your-time/
-
Umer says:
I am using the formula
{=MIN(IFERROR(FIND(lstNumbers,A1),""))}
to extract 61.49 from 61.49 mg but it is giving
#VALUE!
Please guide me on what is wrong.
-
Thomas says:
Please help me with this problem
I have a data as below
IMR 123 , IMR456 , IMR #378 & IMR @890, MQC 123, QC234
Now there are many records as above row
I want only Number after IMR word
Example for above i need output as
123
456
378
890
All in different rows
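The rule described here (take the number that follows IMR, ignoring any spaces or symbols in between) can be written as a single regular expression. A minimal Python sketch, assuming that only non-alphanumeric characters may sit between IMR and its number; `numbers_after_imr` is a hypothetical name.

```python
import re
from typing import List


def numbers_after_imr(text: str) -> List[str]:
    """Return each number that follows the word IMR, one list item per match."""
    # \W* skips spaces and symbols such as # or @ between IMR and the digits.
    return re.findall(r"IMR\W*(\d+)", text)


sample = "IMR 123 , IMR456 , IMR #378 & IMR @890, MQC 123, QC234"
for number in numbers_after_imr(sample):
    print(number)
```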
-
-
jpcpa says:
Hi Sir. I would just like to know how I can separate the name and the tax information number into two columns, given the data below.
"SAVER’S DIGITAL HUB APPLINCE DEPOT (1 Condura ) VAT 215-003-620
""DIGITEL MOBILE PHIL. VMH SUN-0922-877-7139 VAT 215-398-626
""DIGITEL MOBILE PHIL. VMH SUN-0922-877-7139 VAT 215-398-626
""DAPO REST AND BAR VAT 217-078-338
""WILCON BUILDER’S DEPOT INC. (1 pc. Vinyl Adhesive WaterBase 1 Gallon &
110 pc. Kent Vinyl WP Deluxe 6""X36"" Natural Oak-YHT) VAT 221-252-819
""WILCON BUILDERS DEPOT INC. VAT 221-252-819
""AVEXSON CORPORATION VAT 221-764-997 "
"AVEXSON CORPORATION (for ZDJ 514) VAT 221-764-997
""TOKYO TERIYAKI RESTAURANT INC. VAT 226-172-350
""TRG AUTO GEM PARTS SUPPLY NV234-220-733"
Thanks Sir.
-
@JCPCA
What was wrong with my previous answer?
Please ask the question in the Forums and attach a sample file
http://chandoo.org/forum/
-
-
Martin says:
Hi you commented that:
As you can see, this method works well, but fails in some cases like,
European number formats (, for decimal point and . for thousands)
Text with multiple numbers
Fortunately, in my data set, we had only a few incidents like these. So I have decided to manually adjust them than work out even more complicated formula.
This is my problem, almost all my data is in this format and I need a formula to sort out this very issue…
-
kimdom says:
I just can't get myself to think in Excel logic. Can somebody help with reading the number of days from a list like:
Nov15 PUTS (23 days)
March15 TIKS (3 days)
March1 TIKS (25 days)
June11 TIKS (10 days)
So the reader should return the following:
23
3
25
10
etc.
-
Hi Kimdom…
Try this (assuming the text value in A1)
=SUBSTITUTE(MID(A1,FIND("(",A1)+1,99)," days)","")+0
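The formula works in three steps: take the text after the opening parenthesis, delete the trailing " days)", and add 0 to turn what is left into a number. A minimal Python sketch of the same steps, with a made-up function name and the same assumption that every entry ends in " days)":

```python
def days_in_parentheses(text: str) -> int:
    """Read the number of days from text such as 'Nov15 PUTS (23 days)'."""
    start = text.index("(") + 1          # FIND("(",A1)+1
    inner = text[start:start + 99]       # MID(A1, ..., 99)
    number_text = inner.replace(" days)", "")  # SUBSTITUTE(..., " days)", "")
    return int(number_text)              # the +0 coercion to a number


for entry in ["Nov15 PUTS (23 days)", "March15 TIKS (3 days)"]:
    print(days_in_parentheses(entry))
```

An entry worded differently, for example one ending in "(1 day)", would not match the substituted text, so both the formula and this sketch would fail on it.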
-
kimdom says:
Oh yes! Great!
-
-
-
Shlomi says:
Good Day all,
I could not find a solution in the comments above... maybe someone can help me with this?
I have a string in a cell such as 1.2.3.4, and I want to find out whether this string contains a number that appears in another cell, let's say cell $A$1 = 3.
I've been able to do it, BUT
when I have a string such as 1.2.13.4, it looks for the 3 (in $A$1) and returns a match, as if this number were part of the string.
But I want to check whether it contains 13, NOT 3.
I have been thinking that maybe I should first check the length of the number in cell A1 (3) and then check the cell with the string (1.2.13.4).
I could not figure out how to do it... I tried the SEARCH formula and even VBA, InStr(A1,"1.2.13.4"), but with no success. 🙁
any help will be appreciated!!!
Shlomi
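One common way around this kind of partial match is to compare whole dot-separated parts instead of searching for a substring. The Python sketch below shows two equivalent variants; the function names are hypothetical. The second variant wraps both values in the delimiter, an idea that could presumably be carried over to a worksheet formula such as =ISNUMBER(SEARCH("."&$A$1&".","."&B1&".")), where B1 is assumed to hold the dotted string.

```python
def contains_part(dotted: str, number) -> bool:
    """True if number is one of the dot-separated parts of dotted."""
    return str(number) in dotted.split(".")


def contains_part_delimited(dotted: str, number) -> bool:
    """Same test done by substring search on delimiter-wrapped text."""
    return f".{number}." in f".{dotted}."


print(contains_part("1.2.13.4", 3))
print(contains_part("1.2.13.4", 13))
```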
-
TMR says:
Thank you very much for your wonderful lessons for excel beginners like me.
I have a question, if you have time, can you take a look and help me.
Everyday I have to download a list of products which are sold in my site.
Unfortunately, the site keeps adding numbers to my product number to make it a unique value in the system.
For example, my product number is A1.
On the first day, if 2 items of A1 were sold, then the system will show A1-1, A1-2.
And sometimes, if I mistakenly use the description of the listing for A1-1 as a sample for a new listing, then the model number will become A1-1-2.
So I will have a list of A1, A1-1, A1-2, A1-1-2, A1-1-2-1... for example.
I want to delete all the added parts after A1.
Can you help me with this problem?
Thank you very much for your time
-
@TMR
I’ll assume the A1-?? are in Column A
Use a helper column =Left(A1,2) then copy that down
If the A1 can be A11 etc. use =Left(A1,Find("-",A1)-1)
then copy that down
-
TMR says:
Thank you Hui for your reply.
I see your answer and it really works with products which have only 2 characters, like A1.
But I also have some other products which have numbers such as: FA-3MF-BK
or GFT-HJKU-25-BKJ
So when I download the file from the system it always like this:
FA-3MF-BK-1-2
FA-3MF-BK-3
GFT-HJKU-25-BKJ-1-1-2
GFT-HJKU-25-BKJ-2-1
In this case, what should I do?
-
@TMR
How about posting the question in the Chandoo.org Forums
https://chandoo.org/forum/
Please attach a sample file with examples of the range of inputs and outputs you expect.
Otherwise we will be going around in circles for weeks
-
-
-
-
Jeff Faris says:
I need to be able to extract a date from this formatted lot number: F19021286Z. I don’t need the F or the Z. 19 (year) 02 (month) 12 (day) 86 (the last two digits are sequence numbers and are not needed).
Do you have a formula that can do all of that?
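Because the layout described is fixed-width, the date parts can be picked out by position. A minimal Python sketch under two assumptions that are not stated in the question: every lot number has exactly one leading letter, and the two-digit year belongs to the 2000s. The name `lot_date` is made up for illustration.

```python
from datetime import date


def lot_date(lot: str) -> date:
    """Read YYMMDD from positions 2-7 of a lot number such as F19021286Z."""
    year = 2000 + int(lot[1:3])   # assumption: years are 20xx
    month = int(lot[3:5])
    day = int(lot[5:7])
    return date(year, month, day)


print(lot_date("F19021286Z").isoformat())
```

In a worksheet, the same idea could presumably be written with DATE and MID, for example =DATE(2000+MID(A1,2,2),MID(A1,4,2),MID(A1,6,2)), with the lot number in A1.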