Opened 14 years ago

Closed 14 years ago

#205 closed defect (fixed)

email2trac chokes on non-ascii (utf8) characters in workflow

Reported by: eirik.schwenke@…
Owned by: bas
Priority: major
Milestone:
Component: email2trac
Version: trunk
Keywords: unicode utf8 encoding crashing
Cc:

Description

We had an initial ticket status of "forespørsel", and that caused email2trac to fail with the error(s):

email2trac: Traceback (most recent call last):
email2trac:   File "/usr/bin/email2trac", line 2133, in <module>     tktparser.parse(sys.stdin)
email2trac:   File "/usr/bin/email2trac", line 1531, in parse     self.new_ticket(m, subject, spam_msg)
email2trac:   File "/usr/bin/email2trac", line 967, in new_ticket     self.set_ticket_fields(tkt)
email2trac:   File "/usr/bin/email2trac", line 876, in set_ticket_fields     print 'trac.ini name %s = %s' %(name, value)
email2trac: UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 27: ordinal not in range(128)

It would appear some more care is needed to support unicode in all strings in email2trac. I had a brief look at the script, but couldn't easily see if it would be safe and sound to simply wrap all missing strings with a .encode('utf-8') or not.
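
The failure in the traceback can be reproduced in isolation. The following is a minimal sketch in Python 3 (the traceback itself comes from Python 2, where `print` encodes implicitly with the ASCII codec when the output stream has no encoding). The helper name `format_debug_line` and the field name `type` are assumptions made for illustration; `type` was chosen only because it happens to put the `ø` at position 27, as in the reported error.

```python
def format_debug_line(name, value):
    """Build the debug message as text; no encoding happens here."""
    return 'trac.ini name %s = %s' % (name, value)


# Hypothetical field name; the value is the one from the report.
line = format_debug_line('type', 'forespørsel')

# Writing to an ASCII-only stream amounts to encoding with 'ascii',
# which cannot represent U+00F8 (ø).
try:
    line.encode('ascii')
    error_position = None
except UnicodeEncodeError as exc:
    error_position = exc.start  # index of the first unencodable character

# Encoding explicitly as UTF-8 succeeds, because UTF-8 can represent
# every Unicode character.
utf8_bytes = line.encode('utf-8')
```

Encoding explicitly at the point of output is safe for text (unicode) strings. Whether every string in email2trac is a text string at that point is a separate question that this sketch does not answer.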

For now the workaround has been to limit ourselves to ASCII characters in ticket-status names -- but that is obviously not a good solution (it looks a bit strange in Norwegian; in e.g. Japanese it would be hopeless).

Attachments (0)

Change History (5)

comment:1 Changed 14 years ago by bas

  • Status changed from new to assigned

Thanks for reporting. Can you attach the offending email, or email it as an attachment to basv@…? One more question: did you enable the debug option? If yes, can you turn it off and try again?

comment:2 Changed 14 years ago by Eirik Schwenke <eirik.schwenke@…>

Yes, debug was enabled. The email is irrelevant -- everything broke with utf-8 names for ticket statuses (especially for utf8-names in the initial status).

As the site is in production, I can't really try with/without debugging -- at least not right now.

The problem appears to be similar to all other unicode-related problems, only this time the string that contains unicode is one of the ticket fields.

The offending section in trac.ini was:

[ticket]
default_priority = normal
default_type = forespørsel

It works when changed to:

[ticket]
default_priority = normal
default_type = foresporsel

Note: I didn't have to change the ticket-types, although I have done so now -- I'm not entirely sure how trac/email2trac looks up ticket-types -- apparently the fact that there was no "foresporsel" ticket-type didn't seem to have any effect on email2trac.

After changing the line in trac.ini email2trac is working again -- but that is just a workaround, obviously.
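
To find which trac.ini values would trip an ASCII-only code path, the file can be scanned for non-ASCII option values. This is only a sketch: it uses the standard-library `configparser` as a stand-in (Trac has its own configuration layer), the function name `non_ascii_options` is made up for illustration, and `str.isascii()` requires Python 3.7 or later.

```python
import configparser

# The section quoted above, as it was before the workaround.
TRAC_INI = """\
[ticket]
default_priority = normal
default_type = forespørsel
"""


def non_ascii_options(ini_text):
    """Return (section, option, value) for every value with non-ASCII characters."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    found = []
    for section in parser.sections():
        for name, value in parser.items(section):
            if not value.isascii():
                found.append((section, name, value))
    return found


offenders = non_ascii_options(TRAC_INI)
```

For the section above, only `default_type` is reported; `default_priority = normal` is plain ASCII and passes.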

Best regards,

Eirik S

comment:3 Changed 14 years ago by Eirik Schwenke <eirik.schwenke@…>

Note, I see that the ticket might better be named "non-ascii characters in ticket default_type", rather than workflow.

In general it appears all of email2trac needs to assume all strings can be unicode/non-ascii. Not entirely sure if the best thing is to wrap everything in .encode('utf-8') -- as emails can use non-ascii encodings other than utf8 -- and I don't know enough about what .encode() actually does.

If it manages to e.g. encode latin1, EUC and utf16/32 without problems, simply plastering the email2trac code with .encode() calls should do the trick.
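
On what `.encode()` does: it turns a text (unicode) string into bytes in the named charset. It does not detect or convert the charset of bytes that are already encoded; for that, the bytes first have to be decoded with their real source charset. In Python 2, calling `.encode('utf-8')` on a byte string first decodes it implicitly as ASCII, so it fails on exactly the non-ASCII input it was meant to handle. A minimal Python 3 sketch of the decode-then-encode pattern follows; the helper name `to_utf8` and the sample strings are assumptions for illustration.

```python
def to_utf8(raw_bytes, source_charset):
    """Decode bytes from their known source charset, then re-encode as UTF-8."""
    text = raw_bytes.decode(source_charset)
    return text.encode('utf-8')


word = 'forespørsel'

# The same word as it might arrive in differently encoded emails.
samples = {
    'latin-1': word.encode('latin-1'),
    'utf-16': word.encode('utf-16'),
    'utf-8': word.encode('utf-8'),
}
normalized = {charset: to_utf8(raw, charset) for charset, raw in samples.items()}

# EUC-JP sample with Japanese text.
japanese = 'チケット'
euc_as_utf8 = to_utf8(japanese.encode('euc_jp'), 'euc_jp')

# Guessing the wrong source charset fails: Latin-1 bytes are not valid UTF-8 here.
try:
    samples['latin-1'].decode('utf-8')
    wrong_guess_failed = False
except UnicodeDecodeError:
    wrong_guess_failed = True
```

The source charset has to come from somewhere, typically the charset declared in the email's headers. Blanket `.encode()` calls alone cannot supply it.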

comment:4 Changed 14 years ago by bas

(In [378]) email2trac.py.in, fixed some unicode errors when debug is enabled, see #205

comment:5 Changed 14 years ago by Eirik Schwenke <eirik.schwenke@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

I can confirm that this solves our issue.

Thank you!
