Deleting rows based on Duplicates in a column

Closed
jklup Posts 1 Registration date Thursday November 18, 2010 Status Member Last seen November 18, 2010 - Nov 18, 2010 at 10:58 AM
wingamer Posts 5 Registration date Friday January 7, 2011 Status Member Last seen January 7, 2011 - Jan 7, 2011 at 06:04 AM
Hello,

I have a database in which I administer. It is in MS SQL 2005. I pull the data into excel and delete the duplicates if any exist. However, I have recently found a flaw in a number of sections in my DB which is the issue i will eventually fix but in the short term i was wondering if anyone knows of a solution to my problem.

Problem: In my first column I have my PKID, which in Excel is unique to each uploaded item. However, my db is returning multiple of the same PKID and displaying a very slight difference in one of the other columns that will not allow it to remove duplicates however, their should only be one PKID for each item. I have used conditional formating to find all duplicated PKIDs

Hopeful Solution: I was wondering if their is a way to create a macro that would say for every duplicate PKID remove one or more of the duplicated rows so their is only one pkid left. At this point in time I have just created a macro that deletes the rows that I manually highlight, which in a dataset of 9000 rows this can take a very long time.

Any ideas? I'm sorry if this was not descriptive enough please let me know if their are any questions with my information as well!

8 responses

Are you using Excel 2007? Then use the 'Remove Duplicates' feature.
in Excel 2003, in a spare column: if(A1=A2,"DUPE","") then copy it down all rows.
In SQL can you use select distinct ...
Yes I am using 2007. But this only works under the assumption that every column select is identical correct? because in this case I have multiple PKIDs and need to condense into one however, in one of the columns their is a slight enough difference that it doesn't see it as a "true" duplicate.
In this you can still use the 2003 version I showed you. All this is doing is checking a single column for dupes, assuming they are in order.
yes, but my understanding is this will only get rid of the dupes in that column?
ok, i was assuming that the pkid was duplicated , Completely misunderstood the scope.
Can you paste a few sample rows to give a better idea?
No problem I should have been more specific. I can't show you the exact rows, but i have created a little idea

Pkid MFPN QTY Processor Descripton
1 12345 2 Pentium DT Pentium E5300 2600
1 12345 2 Pentium DT Pentium E5300 0
2 e545t 30 Athlon MB Athlon 2299.9999
2 e545t 30 Athlon MB Athlon 2300


As you can see it pulled multiple of the same PKID (should only have one) and the description was pulled 2 different ways. I may have to fix the incision before I can layer it with a band aid. It's pulling it as two different MFPNs (manufacturing part numbers) in my DB and in this case I would want to delete one of the two duplicates the definition doesn't matter as much.

Thanks for your help!
Unfortunately this is mine too, and since it's not consistently only two duplicates of one PKID I find it's impossible to just create a greater than macro saying if pkid is duplicate than take first or greater row