Removing Duplicates
Results 1 to 9 of 9

Thread: Removing Duplicates

  1. #1
    Flash M0nkey
    Join Date
    Sep 2001
    Posts
    3,447

    Removing Duplicates

    Ok this is hard to explain but will give it my best

    ok I have 2 excel sheets which contain client details - one is the list of clients who have been processed the other is those who "may" need to be processed - each client has a unique ID number

    So what I need is a way of comparring the ID's from each sheet so if we have customer details ::::

    Code:
    Client Name RefrenceID Address
    Mr J Smith  1101345678 123 someplace
    who is on the list of customers who may need to be processed - I can take the refrence number 1101345678 and compare that to the list of completed clients - if the refrence is on the list of completed then I can strike him of the list to be done.

    there is some 1000 that need to be done so manually going through and checking is something i would prefer not to do.

    I can export from excel to access/txt/etc

    but does anyone know of a quick and easy way of me doing this??

    v_Ln

  2. #2
    Senior Member nihil's Avatar
    Join Date
    Jul 2003
    Location
    United Kingdom: Bridlington
    Posts
    17,190
    Hi Val,

    I would go at it the other way round? search for a match FROM the processed records to the unprocessed ones and write the matching records ( from the unprocessed file) to a new file ( i.e. move them).

    I am assuming that the processed file is much larger?

    I would copy the files into Access tables personally, as I find it much easier to manipulate data in Access.

    Just my thoughts

  3. #3
    Flash M0nkey
    Join Date
    Sep 2001
    Posts
    3,447
    what i want it to do tho is leave me all the records in the "maybe needs done" file which dont get matches form the processed that way I know which customers accounts still need to be processed.

    My only problem is how to actually get them to compare against each other and then remove the ones which match up?

    what prog should I do it in? how should i go about it? have tried everything - only thing i could think of doing was loading them into crystal and then trying

    {maybe_done} <> {completed}

    but that didn't work - i could go through and enter all the refrence numbers from the completed into the crystal formula like

    not ({maybe_done} in [136153, 136154]) so it only outputs the ones which it does't get a match for but i dont fancy sitting inputing 1000+ refrence ideas into the formula as is time consuming and would prolly take an age to run.

    basically if you think of it like this

    textfile1.txt and textfile2.txt is my 2 files - each file contains a list like

    Code:
    1429231759.00,
    1420478617.00,
    1418104412.00,
    1397702679.00,
    1425478656.00,
    1443229615.00,
    1437766534.00,
    1428725653.00,
    1429261139.00,
    1429868842.00,
    1429221389.00,
    1425622415.00,
    1425472917.00,
    cept with 1000+ numbers in it

    I need someway of getting a prog to compare the contents of file2 to file1 if a number appears in each file delete it and move on to next - leaving me with only the refrence IDs of those clients who still need looked at

    thanks

    v_Ln

  4. #4
    Senior Member
    Join Date
    Oct 2003
    Posts
    234
    I'm thinking use an excel macro or a SQL query in access.

    Try this here

    Sorry if this doesn't quite answer your question, I haven't used Microsoft Office macros in some time.

    //edit: If you have it as a flat-file, try inputing each file line by line into a PERL script, split the lline up, store the numbers in a array, and checking the numbers against the numbers in the other. Very brute-forcish, and may take a long time, but better than doing it by hand I think.

  5. #5
    Flash M0nkey
    Join Date
    Sep 2001
    Posts
    3,447
    I know how to make a query in Access that would strip out duplicates from the same columnbut as these are in 2 seperate sheets - not the same column - using the query doesn't really help

    can't combine them into same column as then i wouldn't be sure which refrences are from completed or which from need done after it had processed them >_<;

    there must be an easy way of doing this am just over-looking

    v_Ln

  6. #6
    Senior Member
    Join Date
    Oct 2003
    Posts
    234
    Beat me to my edit

    I would then use a excel macro, which I think can look at two sheets at a time, or use the PERL method.

  7. #7
    Flash M0nkey
    Join Date
    Sep 2001
    Posts
    3,447
    ugh thats the one thing was hoping to avoid - as my excel (macro) skills are limited - as in aint wrote one in like7yrs - lol
    will get excel guy onto it tomorrow see what he can come up with - its now 20:38 and I can't be bothered - still got the nightly reports to do >_<;

    v_Ln

  8. #8
    Senior Member
    Join Date
    Oct 2003
    Posts
    234
    Check the link I posted, it's a beginners intro to excel macros (not a tut on SQL, sorry if there was any confusion) and I think will give you all the info you need.

  9. #9
    Senior Member nihil's Avatar
    Join Date
    Jul 2003
    Location
    United Kingdom: Bridlington
    Posts
    17,190
    Hi Val,

    What version of MS Office are you running?

    what i want it to do tho is leave me all the records in the "maybe needs done" file which dont get matches form the processed that way I know which customers accounts still need to be processed.
    That is what I am proposing.

    1. Create three access tables: A: Might Be Processed, B: Processed Workfile, C: Processed Master

    2. Copy data from your two excel tables into "A" and "C"

    3. Read "C" against "A" and move the matches into "B"

    4. Check that record count in "A" + "B" = "C"

    5. Backup your system

    6. Create a copy of your Excel "maybe file".....the one with 1000 records.

    7. Import the data from Access table "A"........confirmed to be processed

    8. Rename the existing table to "XYZ.old" or whatever

    9. Rename the new excel table as the original .

    10. Confirm that records in new excel + "B" = 1000, or whatever the exact number was in "XYZ.old"

    If you give me the following about your two excel tables I could take a closer look:

    Column heading, column data type, maximum column record length by column.

    Cheers

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •