Encapsulating Unicode From Redis
Solution 1:
I'm not sure that there is a problem. If you remove all of the .encode('utf8') calls in your code, it still produces a correct file, i.e. the file is the same as the one produced by your current code.
>>> r_server = redis.Redis('localhost')
>>> r_server.keys()
[]
>>> r_server.sadd(u'Hauptstädte', u'東京', u'Godthåb',u'Москва')
3
>>> r_server.keys()
['Hauptst\xc3\xa4dte']
>>> r_server.smembers(u'Hauptstädte')
set(['Godth\xc3\xa5b', '\xd0\x9c\xd0\xbe\xd1\x81\xd0\xba\xd0\xb2\xd0\xb0', '\xe6\x9d\xb1\xe4\xba\xac'])
This shows that keys and values are UTF-8 encoded, therefore .encode('utf8') is not required. The default encoding for the redis module is UTF-8. This can be changed by passing an encoding when creating the client, e.g. redis.Redis('localhost', encoding='iso-8859-1'), but there's no reason to.
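The byte strings in the session above can be checked without a Redis server. This is a minimal sketch in plain Python that only uses str.encode; the variable names are illustrative and it does not call the redis module at all:

```python
# -*- coding: utf-8 -*-
# Encode the same key and members by hand and compare them with the raw
# bytes that appeared in the session. No Redis connection is involved.
key = u'Hauptstädte'
members = [u'Godthåb', u'Москва', u'東京']

encoded_key = key.encode('utf-8')
encoded_members = [m.encode('utf-8') for m in members]

assert encoded_key == b'Hauptst\xc3\xa4dte'
assert encoded_members == [
    b'Godth\xc3\xa5b',
    b'\xd0\x9c\xd0\xbe\xd1\x81\xd0\xba\xd0\xb2\xd0\xb0',
    b'\xe6\x9d\xb1\xe4\xba\xac',
]

# With a different encoding the stored bytes would differ, e.g. Latin-1
# uses a single byte for the a-umlaut instead of two.
assert key.encode('iso-8859-1') == b'Hauptst\xe4dte'
```

Since the bytes match, encoding the strings yourself before handing them to the client adds nothing.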
If you enable response decoding with decode_responses=True then the responses will be converted to unicode using the client connection's encoding. This just means that you don't need to explicitly decode the returned data; redis will do it for you and give you back a unicode string:
>>> r_server = redis.Redis('localhost', decode_responses=True)
>>> r_server.keys()
[u'Hauptst\xe4dte']
>>> r_server.smembers(u'Hauptstädte')
set([u'Godth\xe5b', u'\u041c\u043e\u0441\u043a\u0432\u0430', u'\u6771\u4eac'])
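To see how the two sessions relate, here is a small sketch that decodes the raw members from the first session by hand. It assumes UTF-8 as the connection encoding and is only meant to illustrate the effect of decode_responses=True, not how the redis module implements it:

```python
# Raw byte strings as returned by smembers() without response decoding.
raw_members = {
    b'Godth\xc3\xa5b',
    b'\xd0\x9c\xd0\xbe\xd1\x81\xd0\xba\xd0\xb2\xd0\xb0',
    b'\xe6\x9d\xb1\xe4\xba\xac',
}

# Decoding each one with the connection's encoding gives the unicode
# strings that the decode_responses=True session returned.
decoded_members = {m.decode('utf-8') for m in raw_members}

assert decoded_members == {
    u'Godth\xe5b',
    u'\u041c\u043e\u0441\u043a\u0432\u0430',
    u'\u6771\u4eac',
}
```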
So, in your second example where you write data retrieved from redis to a file, if you enable response decoding then you need to open the output file with the desired encoding. If this is the default encoding then you can just use open(). Otherwise you can use codecs.open() or manually encode the data before writing to the file.
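import codecs

# Assumes r_server was created with decode_responses=True, so members are unicode.
cities_tag = u'Hauptstädte'
with codecs.open('capitals.txt', 'w', encoding='utf8') as f:
    # Drain the set one random member at a time until it is empty.
    while r_server.scard(cities_tag) != 0:
        city = r_server.srandmember(cities_tag)
        f.write(city + '\n')
        r_server.srem(cities_tag, city)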
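The other option mentioned above, encoding manually before writing, might look like the following sketch. The cities list is a hypothetical stand-in for the decoded values that would come back from redis, and the temporary directory is only there to keep the example self-contained:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical stand-in for unicode members returned by the client.
cities = [u'東京', u'Godthåb', u'Москва']
path = os.path.join(tempfile.mkdtemp(), 'capitals.txt')

# Open in binary mode and encode each line explicitly as UTF-8.
with open(path, 'wb') as f:
    for city in cities:
        f.write((city + u'\n').encode('utf-8'))

# Read the file back as UTF-8 text to confirm nothing was mangled.
with io.open(path, 'r', encoding='utf-8') as f:
    round_trip = f.read().splitlines()

assert round_trip == cities
```

Both approaches write the same bytes; codecs.open() just moves the encode step into the file object.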