Deep learning is now the standard tool for classification tasks. It is used not only to tell cats from dogs but also in industrial and everyday applications (e.g. insurance document input management, autonomous driving, molecule folding, …).
Tasks beyond classification require a further layer of information: named entities.
In insurance documents, named entities are typically IBANs, addresses, customer numbers, specific dates, amounts, etc. Successfully extracting them enables, for instance, more precise classification and automated document processing. One example is distinguishing between the company's address, the customer's address, and the address of a local company outlet.
In this presentation, we discuss some promising approaches that we developed for and within ERGO to extract these named entities.
We also elaborate on the challenges that arise, not only at primary insurers, in generating a labeled data set and training scalable models, and on the resulting model performance.