Replicate Excel’s VLOOKUP Function

Why replicate Excel’s VLOOKUP Function?

As much as some people try to avoid VLOOKUP(), it is an incredibly useful function for Excel pros. Those who love it will certainly want to replicate its functionality in Power Query at some point. The thing is, however, depending on which version of VLOOKUP() you need, it can be quite tricky to implement.

Determining an Exact Match

In truth, we don’t need to do anything special to emulate VLOOKUP’s Exact Match, as this functionality can be replicated by simply merging two tables together.

Making an Approximate Match

Replicating VLOOKUP()’s approximate match is a totally different case than the exact match scenario. It requires some logic to emulate these steps, as we’re not trying to match records against each other, we’re actually trying to find the closest record to our request without going over.

Download the Workbook

Our Goal

In this example file you’ll find two tables. This table is named BandingLevels:

monkey_ch22-04

And this one is named DataTable:

monkey_ch22-05

If you review the functions in columns B:D of the DataTable, you’ll see that they contain VLOOKUP() functions as shown in the column headers. Each column is looking up the value shown in column A for that row against the BandingLevels table. Columns B and D are returning the value from column 2 of the BandingLevels table, where Column C is returning the value from the Alt Band column of the BandingLevels table.

In addition, you’ll notice that columns B and C are returning approximate matches because the 4th parameter has either been set to True or omitted. Column D, however, is asking for an exact match, (as the 4th parameter has been set to False,) resulting in all records returning #N/A except for the very last one.

Our goal in this scenario is to replicate Excel’s VLOOKUP function exactly, making it accept the same parameters, in the same order, and return consistent results. While we won’t create the function from scratch here, we will show you the function, and explain how it works.

Step 1: Insert a VLOOKUP function

The first thing we need to do is insert the VLOOKUP function. To do that:

  • Create a new query –> From Other Sources –> Blank Query
  • Open the Advanced Editor
  • Select all the code in the window
  • Replace it with the following code

(lookup_value as any, table_array as table, col_index_number as number, optional approximate_match as logical ) as any =>
let
/*Provide optional match if user didn't */
matchtype =
if approximate_match = null
then true
else approximate_match,
/*Get name of return column */
Cols = Table.ColumnNames(table_array),
ColTable = Table.FromList(Cols, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
ColName_match = Record.Field(ColTable{0},"Column1"),
ColName_return = Record.Field(ColTable{col_index_number - 1},"Column1"),
/*Find closest match */
SortData = Table.Sort(table_array,{{ColName_match, Order.Descending}}),
RenameLookupCol = Table.RenameColumns(SortData,{{ColName_match, "Lookup"}}),
RemoveExcess = Table.SelectRows(RenameLookupCol, each [Lookup] <= lookup_value),
ClosestMatch=
if Table.IsEmpty(RemoveExcess)=true
then "#N/A"
else Record.Field(RemoveExcess{0},"Lookup"),
/*What should be returned in case of approximate match? */
ClosestReturn=
if Table.IsEmpty(RemoveExcess)=true
then "#N/A"
else Record.Field(RemoveExcess{0},ColName_return),
/*Modify result if we need an exact match */
Return =
if matchtype=true
then ClosestReturn
else
if lookup_value = ClosestMatch
then ClosestReturn
else "#N/A"
in Return

  • Click Done
  • Rename the function to pqVLOOKUP
  • Go to Home –> Close & Load

Step 2: Create a Pointer to the Lookup Table

This part is fairly straight forward, we simply need to create a pointer to the table we are going to look up our data in: the BandingLevels table.

  • Select any cell in the BandingLevels table
  • Create a new query –> From Table
  • Go to Home –> Close & Load to… –> Only Create Connection

You’re done with this stage.

Step 3: Test the VLOOKUP function

We are now ready to see how it works. Let’s pull in the DataTable, and remove all of the Excel versions of the functions.

  • Select any cell in the DataTable table
  • Create a new query –> From Table
  • Right click the Values column –> Remove other columns

We are now down to a single column of data:

monkey_ch22-06

Explicit Approximate Match

So let’s see if the pqVLOOKUP function works for us. First we’ll try to replicate the following formula:
=VLOOKUP([Values],BandingLevels,2,true)
To do that, we can take the following steps:

  • Go to Add Column –> Add Custom Column
  • Name the column “2,True”
  • Use the following formula:

=pqVLOOKUP([Values],BandingLevels,2,true)
The results are identical to what Excel would show:

monkey_ch22-07

Implicit Approximate Match

Let’s try leaving the ,true off the end and returning the 3rd column instead from the lookup table instead of the 2nd:

  • Go to Add Column –> Add Custom Column
  • Name the column “3,default”

Use the following formula:
=pqVLOOKUP([Values],BandingLevels,3)
The results are again identical to what Excel would show:

monkey_ch22-08

Exact Match

Let’s try one more. What if we wanted to define an exact match against the 2nd column of the lookup table?

  • Go to Add Column –> Add Custom Column
  • Name the column “2,false”

Use the following formula:
=pqVLOOKUP([Values],BandingLevels,2,false)
And once again, the results are spot on with what Excel delivers:

monkey_ch22-09

The query can now be finalized:

  • Go to Home –> Close & Load

Caveats

Even though you can use this function to emulate VLOOKUP’s exact match, you shouldn’t. The reason is that you can accomplish an exact match effect by merging tables together; a method that will be much faster. If you need the approximate match functionality, however, this is a viable method.

We should also make you aware of one minor difference between Excel’s VLOOKUP() and the pqVLOOKUP function; the #N/A value returned by pqVLOOKUP is actually text, not a true error:

monkey_ch22-10

Returning text is as close as we could get when returning error, as there is no way to output a true #N/A error in Power Query. (Any errors inside Power Query show up as empty cells when loaded in to an Excel Table.)

So how does the pqVLOOKUP function work? Let’s take a look at the code:
(lookup_value as any, table_array as table, col_index_number as number, optional approximate_match as logical ) as any =>
let
/*Provide optional match if user didn't */
matchtype =
if approximate_match = null
then true
else approximate_match,
/*Get name of return column */
Cols = Table.ColumnNames(table_array),
ColTable = Table.FromList(Cols, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
ColName_match = Record.Field(ColTable{0},"Column1"),
ColName_return = Record.Field(ColTable{col_index_number - 1},"Column1"),
/*Find closest match */
SortData = Table.Sort(table_array,{{ColName_match, Order.Descending}}),
RenameLookupCol = Table.RenameColumns(SortData,{{ColName_match, "Lookup"}}),
RemoveExcess = Table.SelectRows(RenameLookupCol, each [Lookup] <= lookup_value),
ClosestMatch=
if Table.IsEmpty(RemoveExcess)=true
then "#N/A"
else Record.Field(RemoveExcess{0},"Lookup"),
/*What should be returned in case of approximate match? */
ClosestReturn=
if Table.IsEmpty(RemoveExcess)=true
then "#N/A"
else Record.Field(RemoveExcess{0},ColName_return),
/*Modify result if we need an exact match */
Return =
if matchtype=true
then ClosestReturn
else
if lookup_value = ClosestMatch
then ClosestReturn
else "#N/A"
in Return

The code is fairly long and complex, and uses a variety of tricks, but the basic methodology is this:

  1. Pull in the data table
  2. Sort it descending by the first column
  3. Remove all records greater than the value being searched for
  4. Return the value in the requested column for the first remaining record unless an Exact match was specified
  5. If an Exact match was specified, then test to see if the return is a match. If it is, return the value. If not, return #N/A.

Items that are of note in the code:

  • Each of the variables in the parameters has an explicit data type declared. This is to prevent the user from accidentally providing a table where a numeric column number is required.
  • The approximate_match variable is defined as optional, meaning the user can ignore it.
  • The matchtype variable tests to see if a match type was specified. If a match type was specified, it is assigned to the matchtype variable, but if not (approximate_match is null), then a value of true will be assigned.
  • The name of the column to be returned is pulled by reviewing the column headers of the table, splitting them into a list of records, then pulling out the record whose index matches the requested column (less 1 to adjust to base 0).
  • The data is sorted in descending order based on the column to be searched. All records greater than the requested value are removed (by selecting all rows where the value is less than or equal to the value being searched for.)
  • If no rows remain, a #N/A result is then stored, but if there are rows, then the first record in the lookup column is stored. This result can later be checked to see if it matches the record being searched for (important for the exact match scenario)
  • The approximate match value is then (always) calculated, even if an exact match was requested. If no rows are present in the data set a #N/A result is stored, otherwise the closest value is pulled from the return column.
  • The final test checks the type of match requested. If it is an approximate match, then the closest match is always returned (which may be #N/A). If, however, the match type was an exact match, the code will return #N/A instead of the closest match unless the lookup column’s value matched the value being sought exactly.

It should go without saying that this is not a function you’ll knock up in a few minutes. It is long and complicated, and took several hours of development and debugging in order to get it correct. It is, however, a fairly robust function in the way it works, and showcases how to build complex functions using Power Query..

Power Query Patterns
Power Query Book
Power Query Workshop

this work was shared 0 times
 700
About

 Power Query Training

  (13 articles)

We are a team committed to delivering the best content for new and seasoned users of a tool that we believe in: Power Query. We firmly believe that this is THE best tool to be added to Microsoft Excel in a long time.

  • Tan

    unable to download the sample file

    • hey! Sorry about that. We just fixed the url and you should be able to download the file now.

  • Terry

    How can I repeat/automate the transformation steps on 3 different files which are updated daily? Currently I am doing the transformations daily creating 3 appends and then joining in the data model. The files are all excel

    • Ken Puls, FCPA, FCMA

      Hi Terry,

      If you have our book, the answer is most likely covered by Chapter 21 on custom functions. If you don’t have the book (or that’s not what you’re looking for), I’d suggest posting the question in our Power Query help forum (on my site) at http://www.excelguru.ca/forums. That would allow us to work through the problem with your more completely (and even allows you to attach sample data so we can see exactly what you’re working with.)

  • KC

    Hi, the pqVlookup function work perfectly. However, I found that if I am adding more columns with pgVlookup against a large remove excel file (+200MB), it took years to refresh the table. I am not sure if it’s due to CPU bound by my PC or it really takes time to do the lookup. Please see if you have similar issue.

    • Ken Puls, FCPA, FCMA

      Hi KC,

      Yes, in large data sets this can be an issue as the BandingLevels table is being refreshed for every row of the calling table. To fix this, I’d suggest adding a step before you call the VLOOKUP function like this:
      CachedTable=Table.Buffer(BandingLevels)

      And then pass CachedTable instead of BandingLevels in. That will load BandingLevels into memory, and prevent it from being re-loaded on each row.

  • saprano

    Thanks very much for this Ken/Miguel!…This is a great bit of code for looking up a category/band (inexact match) in M/Power Query.

    I needed to tweak the function to look only at specific bands (i.e. filter the table with the bands by a specific category which also needs to be referenced)…code copied below in case this is useful for anyone

    ____________

    (category_col_name as text, category_value as text, lookup_value as any, table_array as table, col_index_number as number, optional approximate_match as logical ) as any =>

    let

    /*Provide optional match if user didn’t */

    matchtype =

    if approximate_match = null

    then true

    else approximate_match,

    /*Get name of return column */

    Cols = Table.ColumnNames(table_array),

    ColTable = Table.FromList(Cols, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

    ColName_match = Record.Field(ColTable{0},”Column1″),

    ColName_return = Record.Field(ColTable{col_index_number – 1},”Column1″),

    /*Find closest match */

    FilterData = Table.SelectRows(table_array, each Record.Field(_, category_col_name) = category_value),

    SortData = Table.Sort(FilterData ,{{ColName_match, Order.Descending}}),

    RenameLookupCol = Table.RenameColumns(SortData,{{ColName_match, “Lookup”}}),

    RemoveExcess = Table.SelectRows(RenameLookupCol, each [Lookup] <= lookup_value),

    ClosestMatch=

    if Table.IsEmpty(RemoveExcess)=true

    then "#N/A"

    else Record.Field(RemoveExcess{0},"Lookup"),

    /*What should be returned in case of approximate match? */

    ClosestReturn=

    if Table.IsEmpty(RemoveExcess)=true

    then "#N/A"

    else Record.Field(RemoveExcess{0},ColName_return),

    /*Modify result if we need an exact match */

    Return =

    if matchtype=true

    then ClosestReturn

    else

    if lookup_value = ClosestMatch

    then ClosestReturn

    else "#N/A"

    in Return

close

The Best Power Query Course in the World!


  • circle
    Best Content Worldwide
  • circle
    Top Level Support
  • circle
    Over 12 hours of content
  • circle
    More rewards!