java - UIMA for parsing emails
I am new to UIMA.
I want to develop an app using UIMA and uimaFIT that can parse emails related to air tickets, such as confirmation emails, cancellation emails, etc., and extract valuable information such as ticket number, flight number, departure time, arrival time, passenger name, etc. How can I achieve this using uimaFIT? I tried using uimaFIT to read the email as a string and then extracting the information with regular expressions, but that seems complicated because the emails are not structured. Are there any suggestions on how to connect to the emails and perform the parsing without using regex?
Any suggestions would be appreciated.
Is the set of email types (confirmation email, cancellation email, etc.) small enough? If yes, then as a first step, try a simple classification of the email types. In the next steps, you can apply different tools based on the type of email.
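To illustrate the classification step, here is a minimal sketch in plain Java that scores an email against a keyword list per type. The class name `EmailTypeClassifier`, the `EmailType` values, and the keyword lists are all assumptions made up for this example; real emails would likely need a richer list or a trained classifier. The same logic could be wrapped in a UIMA annotator, which is not shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical keyword-based classifier; the keyword lists are illustrative assumptions.
final class EmailTypeClassifier {

    enum EmailType { CONFIRMATION, CANCELLATION, UNKNOWN }

    private static final Map<EmailType, List<String>> KEYWORDS = new LinkedHashMap<>();
    static {
        KEYWORDS.put(EmailType.CANCELLATION,
                Arrays.asList("cancelled", "canceled", "cancellation", "refund"));
        KEYWORDS.put(EmailType.CONFIRMATION,
                Arrays.asList("confirmed", "confirmation", "e-ticket", "itinerary"));
    }

    // Returns the type whose keywords occur most often in subject + body,
    // or UNKNOWN when no keyword is found at all.
    static EmailType classify(String subject, String body) {
        String text = (subject + "\n" + body).toLowerCase(Locale.ROOT);
        EmailType best = EmailType.UNKNOWN;
        int bestScore = 0;
        for (Map.Entry<EmailType, List<String>> entry : KEYWORDS.entrySet()) {
            int score = 0;
            for (String keyword : entry.getValue()) {
                if (text.contains(keyword)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(classify("Your booking is confirmed", "E-ticket attached."));
        System.out.println(classify("Flight cancellation notice", "A refund will follow."));
    }
}
```

Once the type is known, each type can be routed to its own set of extraction rules.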
For the rest, I think it's best to use regexes, even if that is tedious. You might want to look at UIMA TextMarker to implement the regexes/rules.
- ticket number: regex
- flight number: regex
- departure time, arrival time: regex
- passenger name: person NER (here is a UIMA example) (or match against the email's To: field?)
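
As a rough sketch of the regex-based fields in the list above, the following plain Java example uses `java.util.regex` with label-anchored patterns. The class name `TicketFieldExtractor`, the field keys, and the patterns themselves are assumptions for illustration: they expect labels such as "Ticket number:" or "Departure:" and a 24-hour HH:mm time, which real airline emails may not follow. The passenger name is left out because it is better handled by NER, as noted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical extractor; every pattern below is an assumption about the email layout.
final class TicketFieldExtractor {

    private static final String TIME = "((?:[01]?\\d|2[0-3]):[0-5]\\d)";

    private static final Map<String, Pattern> FIELD_PATTERNS = new LinkedHashMap<>();
    static {
        // 3-digit airline code, optional hyphen, 10-digit serial number.
        FIELD_PATTERNS.put("ticketNumber", Pattern.compile(
                "(?i)ticket\\s*(?:number|no\\.?)?\\s*[:#]?\\s*(\\d{3}-?\\d{10})"));
        // 2-character airline designator followed by 1 to 4 digits.
        FIELD_PATTERNS.put("flightNumber", Pattern.compile(
                "(?i)flight\\s*(?:number|no\\.?)?\\s*[:#]?\\s*([A-Z0-9]{2}\\s?\\d{1,4})\\b"));
        FIELD_PATTERNS.put("departureTime", Pattern.compile(
                "(?i)depart(?:ure|s)?\\s*(?:time)?\\s*[:\\-]?\\s*" + TIME));
        FIELD_PATTERNS.put("arrivalTime", Pattern.compile(
                "(?i)arriv(?:al|es)\\s*(?:time)?\\s*[:\\-]?\\s*" + TIME));
    }

    // Returns the first match of each field; fields that do not match are omitted.
    static Map<String, String> extract(String emailBody) {
        Map<String, String> found = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> entry : FIELD_PATTERNS.entrySet()) {
            Matcher matcher = entry.getValue().matcher(emailBody);
            if (matcher.find()) {
                found.put(entry.getKey(), matcher.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String email = "Ticket number: 176-1234567890\n"
                + "Flight: EK 203\n"
                + "Departure: 08:45\n"
                + "Arrival: 14:20\n";
        extract(email).forEach((field, value) -> System.out.println(field + " = " + value));
    }
}
```

Anchoring each pattern on its label keeps false positives down, at the cost of missing emails that phrase the label differently. That is where separate rule sets per email type help.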