I was in the middle of data transforming to create a clean dataset for a Charticulator post, that I come across a dataset as
This dataset is for Carbon dioxide emissions in different countries around the world.
The only columns I need from this datasets is about country name and the number of emissions for each year ( that is the year column) I need to remove other columns.
There are two types of columns :
Columns with specific patterns name: In the above example, Type, Type_2, Type_3, and so forth and also, Footnotes_1, Footnotes_2, Footnotes_3, and so forth…
Columns without specific patterns:
Such as country code, series codes and MDG.
IN this example I need to remove all the above columns, for the columns without a pattern, in this example, there are not many, so I decided to manually remove them, but for the column with a specific pattern in their name, I wrote some codes in Power Query
As a result, I start to check the availabe column ‘name. To do that, I click on the Advanced Editor
Replace the last three line with below
columnsname=Table.ColumnNames(#"Removed Columns" )
In this code we are going to list the name of the columns as below, then we are going to convert the list to Table.
Now, I want to find and filter columns that has specifi pattern
1- all columns start with Type
2- All column Start with Footnote
So in the new column, I am going to use the text filter
Next, in the new window, select the “Begin With” option. Power query is Case Sensitive, so make sure to write it down as it is.
We are going to convert the columns to list again using menu in Powe Query Editor, Click on Transform then Convert to list
Then back to Advance Editor to write some codes. We have the list of columns that need to be remove on the Column1. Also, we have the list of all columns as well
I am going to add some code at the end as below, that we use the function, RemoveColumns that accepted two parameters: The whole table and the name of columns need to be removed. I have the list of the whole table on #”Removed Columns” and the list of columns need to be removed from Column1
Column1 = #"Filtered Rows"[Column1],
RemoveColumns=Table.RemoveColumns( #"Removed Columns",Column1)
So the result is :
As you can see all columns that start with Type or Footnote have been removed.