How To Make A Unique Data From Strings
Solution 1:
The file stack.txt contains this:
"India1,India2,myIndia"
"Where,Here,Here"
"Here,Where,India,uyete"
"AFD,TTT"
Here you go:
import ast
from collections import OrderedDict

with open("stack.txt", "r") as f:
    # read your data in and strip off any new-line characters;
    # ast.literal_eval unquotes each line without executing arbitrary code, unlike eval
    data = [ast.literal_eval(line.strip()) for line in f]
# get individual words into a list
individual_elements = [word for row in data for word in row.split(",")]
# remove duplicates and preserve order
uniques = OrderedDict.fromkeys(individual_elements)
# convert from OrderedDict object to plain list
final = list(uniques)
print(final)
Which yields this:
['India1', 'India2', 'myIndia', 'Where', 'Here', 'India', 'uyete', 'AFD', 'TTT']
Edit: To get your desired output, just print the list in the format you want:
print("\n".join(final))
Which is equivalent, from an output standpoint, to this:
for x in final:
    print(x)
Which yields this:
India1
India2
myIndia
Where
Here
India
uyete
AFD
TTT
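On Python 3.7 and later, plain dicts also preserve insertion order, so the same de-duplication can be sketched without OrderedDict. The rows are inlined here as an assumption, so the snippet runs without stack.txt; the names rows and words are illustrative only:

```python
# Sample rows inlined so the sketch runs without stack.txt (an assumption)
rows = [
    "India1,India2,myIndia",
    "Where,Here,Here",
    "Here,Where,India,uyete",
    "AFD,TTT",
]

# Flatten the comma-separated rows into one list of words
words = [word for row in rows for word in row.split(",")]

# dict.fromkeys keeps the first occurrence of each key, in insertion order
final = list(dict.fromkeys(words))
print(final)
```

This should print the same list as the OrderedDict version above.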
Solution 2:
Why use numpy? I'm also not sure whether you want to use the same file as input and output.
#!/usr/bin/env python
# give a name to my data
inputData = """India1,India2,myIndia
Where,Here,Here
Here,Where,India,uyete
AFD,TTT"""
# if you want to read the data from a file instead
# (read() returns one string, which the split("\n") below expects)
#inputData = open(fileName, 'r').read()
outputData = ""
tempData = list()
for line in inputData.split("\n"):
    lineStripped = line.strip()
    lineSplit = lineStripped.split(',')
    lineElementsStripped = [element.strip() for element in lineSplit]
    tempData.extend( lineElementsStripped )
# a set removes duplicates, but it does not keep the original order
tempData = set(tempData)
outputData = "\n".join(tempData)
print("\nInputdata: \n%s" % inputData)
print("\nOutputdata: \n%s" % outputData)
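Because a set has no defined order, the output of the set-based approach can differ from run to run. If a stable (though not original) order is enough, one option is to sort the unique words before joining them. This is a minimal sketch of that idea, not part of the answer above; input_data and unique_words are names chosen for illustration:

```python
input_data = """India1,India2,myIndia
Where,Here,Here
Here,Where,India,uyete
AFD,TTT"""

# Collect every stripped element into a set, which drops duplicates
unique_words = set()
for line in input_data.splitlines():
    unique_words.update(element.strip() for element in line.split(","))

# Sorting gives a reproducible order (uppercase letters sort before lowercase)
output_data = "\n".join(sorted(unique_words))
print(output_data)
```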
Solution 3:
It sounds like you probably have a csv file. You don't need numpy for that, the included batteries are all you need.
import csv

data = []
with open('test.txt') as f:
    reader = csv.reader(f)
    for row in reader:
        data.extend(row)
You can .extend lists rather than .append to them. It's basically like saying
for thing in row:
    data.append(thing)
That will still leave the duplicates, though. If you don't care about order you can just make it a set and call .update() instead of .extend():
data = set()
with open('test.txt') as f:
    reader = csv.reader(f)
    for row in reader:
        data.update(row)
And now everything is unique. But if you care about order, keep data as a list (as in the first snippet) and filter out the duplicates yourself:
unique_data = []
for thing in data:
    if thing not in unique_data:
        unique_data.append(thing)
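The membership test against a list rescans the list for every item, which gets slow on large inputs. A common variation, sketched here with a hypothetical helper named unique_in_order, tracks already-seen items in a set while still appending to a list to keep the order:

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first occurrences."""
    seen = set()      # fast membership checks
    result = []       # preserves the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Illustrative input, taken from two of the sample rows
data = ["Where", "Here", "Here", "Here", "Where", "India", "uyete"]
print(unique_in_order(data))
```

This only works when the items are hashable, which is the case for the strings read from the file.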
If your test.txt file contains this text:
"India1,India2,myIndia "
"Where,Here,Here "
"Here,Where,India,uyete"
"AFD,TTT"
And not
India1,India2,myIndia
Where,Here,Here
Here,Where,India,uyete
AFD,TTT
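Then you don't quite have a csv. You can either fix what's generating your csv, manually remove the quotes, or just fix it on the fly:
def remove_quotes(file):
    # yield each line without its surrounding quotes and trailing new-line
    for line in file:
        yield line.strip('"\n')

# inside the with block, wrap the file before handing it to csv.reader
reader = csv.reader(remove_quotes(f))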
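Putting the pieces together, the quote-stripping generator can feed csv.reader directly, since the reader accepts any iterable of strings. The sketch below uses io.StringIO as a stand-in for the opened test.txt (an assumption, so it runs without the file) and also strips the trailing spaces that appear inside the quotes in the sample:

```python
import csv
import io


def remove_quotes(lines):
    # Yield each line without surrounding quotes or the trailing new-line
    for line in lines:
        yield line.strip('"\n')


# Stand-in for open('test.txt'); contents copied from the quoted sample
quoted = io.StringIO(
    '"India1,India2,myIndia "\n'
    '"Where,Here,Here "\n'
    '"Here,Where,India,uyete"\n'
    '"AFD,TTT"\n'
)

data = []
for row in csv.reader(remove_quotes(quoted)):
    # Strip stray whitespace such as the space in 'myIndia '
    data.extend(field.strip() for field in row)

# Drop duplicates while keeping first occurrences (dicts are ordered on Python 3.7+)
unique_data = list(dict.fromkeys(data))
print(unique_data)
```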