Delete duplicate rows in Excel

How to Remove Duplicate Rows in Excel and Google Sheets

Let’s assume that you have two Excel sheets containing 50 rows each. After merging them, you may find some duplicate rows in your spreadsheet. If there are only ten duplicates, it is quite easy to delete them one by one. However, suppose you have a spreadsheet containing 1,500 rows and you want to remove the duplicate rows from it. In such cases, you can use the following trick to remove duplicate rows in Excel and Google Sheets. If you are using Microsoft Excel, removing duplicate rows is very easy because Microsoft provides a built-in option. You can find the option in Microsoft Excel 2016.

To remove all duplicate values, you can use the Remove Duplicates feature to, well, remove the duplicates! Using the Remove Duplicates feature:

Select the data set that contains duplicates.

Navigate to the Data tab in the toolbar.

In the Data Tools section of the Data tab, select Remove Duplicates.

If you selected the entire data set, an option will appear asking you to specify which columns you wish to delete duplicates from. If you want duplicates removed from the entire data set, leave all the columns selected.

If you selected a specific column, a warning will appear asking you to confirm that you want to limit removal to the selected column. If yes, be sure to select “Continue with current selection”. If you decide to expand it to the entire data set, choose “Expand the selection”.
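For readers who also handle data outside Excel, the idea behind Remove Duplicates can be sketched in a few lines of Python. This is only an illustration, not how Excel implements the feature: the function name `remove_duplicate_rows`, the `key_columns` parameter, and the sample rows are all made up for this example, and the sketch assumes that the first copy of each duplicate is the one to keep. Passing `key_columns` plays the role of choosing which columns to compare.

```python
def remove_duplicate_rows(rows, key_columns=None):
    """Return rows with later duplicates dropped, keeping the first occurrence.

    rows: list of dicts, one dict per spreadsheet row (hypothetical layout).
    key_columns: column names to compare; None compares every column.
    """
    seen = set()
    unique_rows = []
    for row in rows:
        # Compare either the chosen columns or all columns of the row.
        columns = key_columns if key_columns is not None else sorted(row)
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Made-up sample data for illustration only.
rows = [
    {"state": "Wisconsin", "hamsters": 50000},
    {"state": "Iowa", "hamsters": 12000},
    {"state": "Wisconsin", "hamsters": 50000},
    {"state": "Iowa", "hamsters": 9000},
]

all_columns = remove_duplicate_rows(rows)                        # compare whole rows
state_only = remove_duplicate_rows(rows, key_columns=["state"])  # compare one column

print(len(all_columns))  # 3
print(len(state_only))   # 2
```

Comparing on a single column removes more rows than comparing whole rows, which is why it is worth deciding up front which columns should count when judging whether two rows are duplicates.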

Collecting, analyzing, and reporting with data can be daunting. The person that SAGE Publishing, the parent of MethodSpace, turns to when it has questions is Diana Aleman, editor extraordinaire for SAGE Stats and U.S. And now she is bringing her trials, tribulations, and expertise with data to you in a monthly blog, Tips with Diana. Stay tuned for Diana’s experiences, tips, and tricks with finding, analyzing, and visualizing data.

There may be more than a few data points to double-check as you review and clean a data file. These can include blank values, outlier data points, data label misspellings, and so on. Duplicate data points are probably among the most difficult to spot, unless you’re lucky. Duplicates are exactly what they sound like: exact copies of the same data point. The appearance of duplicates does not necessarily mean the entire data set is wrong, only that the data set may require a closer eye and some additional clean-up work, as do most data sets.

So why does this matter? It matters because duplicate data points may inadvertently lead to miscalculation or misunderstanding of the data. For instance, if I am looking at a data set on the number of hamsters across the United States and I see that Wisconsin has two data points, both of which are 50,000 (totally fabricated!), then I can infer that the data set has mistakenly included two duplicate values for Wisconsin.

Thankfully, Excel offers two handy features that simplify the identification and removal of duplicate data points from a file! It usually takes finding one set of duplicate data points for me to decide that Conditional Formatting should be applied to check whether any additional duplicates are present in a data file. The Conditional Formatting feature programmatically identifies duplicates in an entire data set. Without this feature I would be forced to manually check each data point. That may not be a big deal for a data set with about 50 rows of data, but it can be an incredibly inefficient process for a data set that contains, say, over 50,000 rows of data.

To highlight duplicates with Conditional Formatting:

If you want to identify duplicates across the entire data set, select the entire set. You don’t have to, though: you may want to identify duplicate values in only a particular column or row.

Navigate to the Home tab and select the Conditional Formatting button.

In the Conditional Formatting menu, select Highlight Cells Rules.

In the menu that pops up, select Duplicate Values.

A window will appear detailing how Excel will highlight the duplicate values it identifies. The default setting is light red highlighting with red font, which works very well.

All duplicate values should now be highlighted in red! After reviewing the highlighted duplicates, you can determine whether all the duplicates should be removed or not.
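The duplicate highlighting that Conditional Formatting performs can also be approximated outside Excel. The short Python sketch below is an illustration under stated assumptions, not Excel's actual logic: the function name `find_duplicate_values` and the list of states are invented for the example, and it assumes that every copy of a repeated value should be flagged, including the first one.

```python
from collections import Counter


def find_duplicate_values(values):
    """Return one True/False flag per value: True when the value occurs more than once."""
    counts = Counter(values)
    return [counts[v] > 1 for v in values]


# Made-up sample column for illustration only.
states = ["Wisconsin", "Iowa", "Wisconsin", "Ohio"]
flags = find_duplicate_values(states)

# Print each value with a marker, a plain-text stand-in for red highlighting.
for state, is_duplicate in zip(states, flags):
    marker = "DUPLICATE" if is_duplicate else ""
    print(f"{state:<10} {marker}")
```

Flagging rather than deleting keeps the review step separate from the removal step, so you can look over the marked values before deciding which ones should actually go.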