rajeshkumar created the topic: shell script merge two list and remove duplicates
You want all the records from list_A, supplemented by the records from list_B whose name does not already appear in list_A. Mathematically this is:
A + B - { (w, v) in B | (w, v') in A for some v' }
There are many ways of accomplishing this, depending on your access to the databases and the efficiency you need.
* If you can modify DB1 (which holds A), then download table B from DB2, upload it to DB1, and extract your data with the appropriate join.
* If you can’t modify DB1, then download both A and B and concatenate them into a single stream, with A followed by B. Sort the stream by the first field only, using a stable sort so that for equal names the record from A stays ahead of the record from B. Then process the stream one record at a time. Duplicate names will be adjacent. If the same name appears more than once, print the first record and ignore the subsequent records with that name.
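If the tables are available as flat files, the join from the first bullet can be approximated with the standard `join` utility. The following is only a sketch, assuming whitespace-separated records with the name in the first field and no repeated names within a single list; the variable names and the temporary directory are illustrative choices, not part of the original setup.

```shell
#!/bin/sh
# Use one collation order for both sort and join
export LC_ALL=C

A="Smith value1
Jones value2
Wilson value3"
B="Smith value10
Wilson value11
Fox value12
Brown value13"

# join needs both inputs sorted on the join field (field 1 by default)
tmpdir=$(mktemp -d)
printf '%s\n' "$A" | sort -k1,1 > "$tmpdir/a.sorted"
printf '%s\n' "$B" | sort -k1,1 > "$tmpdir/b.sorted"

# All of A, then only those records of B whose name has no match in A
# (-v 2 prints the unpairable lines of the second file)
merged=$(printf '%s\n' "$A"; join -v 2 "$tmpdir/a.sorted" "$tmpdir/b.sorted")

rm -r "$tmpdir"
echo "$merged"
```

Here the records from A keep their original order and the unmatched records from B follow in sorted order.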
Here is a sample solution to your problem (starting with two lists of names/values):
#!/bin/bash
A="Smith value1
Jones value2
Wilson value3"
B="Smith value10
Wilson value11
Fox value12
Brown value13"
PrevName="Not a valid name"
# -k1,1 sorts on the name only; -s (stable, a GNU/BSD sort option) keeps
# A's record ahead of B's record when the names are equal
echo "$A
$B" | sort -s -k1,1 |
while read -r Name Value
do
    # Print only the first record seen for each name
    if [ "$Name" != "$PrevName" ]; then
        echo "$Name $Value"
    fi
    PrevName="$Name"
done > outfile
Here is the output (the contents of outfile):
Brown value13
Fox value12
Jones value2
Smith value1
Wilson value3
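As an alternative, awk can keep the first record seen for each name in a single pass, with no sort step. This is a minimal sketch, assuming awk is available and the name is always the first whitespace-separated field; the variable `merged` is just an illustrative name.

```shell
#!/bin/sh
A="Smith value1
Jones value2
Wilson value3"
B="Smith value10
Wilson value11
Fox value12
Brown value13"

# seen[$1]++ evaluates to 0 the first time a name appears, so the pattern
# !seen[$1]++ is true only for the first record with that name.
# A is fed in before B, so A's record wins whenever a name is in both lists.
merged=$(printf '%s\n' "$A" "$B" | awk '!seen[$1]++')
echo "$merged"
```

The result is in input order (all of A, then the new names from B) rather than sorted by name; pipe it through sort if you need the ordering shown above.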
Regards,
Rajesh Kumar
Tweet me @ twitter.com/RajeshKumarIn