Reconstructing financial statements

Bennett Jr, F.G.; (2008) Reconstructing financial statements. Presented at: DESI II: Second International Workshop on Supporting Search and Sensemaking for Electronically Stored Information in Discovery Proceedings, University College London, UK. Green open access

Abstract

This paper introduces a tool for the reconstruction and validation of categorized totals embedded in untrusted and unformatted text, such as OCR scans of financial statements. The tool is a spin-off of academic research into the funding of Japanese third-sector organizations, the annual reports of which are frequently published in the form of PDF files containing document images. A number of techniques at string-, line-, and document-level are used to resolve ambiguities and obtain the greatest possible recovery rate for the underlying data, while excluding the content of untrustworthy documents from the final sample. In a preliminary trial "in the wild", the tool has returned validated income totals for 47.9% of the documents in a heterogeneous set of 2205 annual reports.
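
As a rough illustration of what validating a categorized total can mean, the sketch below accepts a stated total only when the category amounts recovered from noisy text sum to it exactly. It is not the tool described in the paper: the OCR substitution table, the "label amount" line layout, and the function names `parse_amount`, `parse_line`, and `validate_total` are all assumptions made for this example.

```python
# Hypothetical OCR letter-for-digit confusions; the paper's actual
# string-level rules are not given in this record.
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5"})


def parse_amount(token):
    """Return an int for a noisy numeric token, or None if it is unreadable."""
    cleaned = token.translate(OCR_DIGIT_FIXES).replace(",", "")
    if cleaned.isascii() and cleaned.isdigit():
        return int(cleaned)
    return None


def parse_line(line):
    """Split an assumed 'label amount' line on its last whitespace run."""
    parts = line.rsplit(None, 1)
    if len(parts) != 2:
        return None
    label, token = parts
    return label, parse_amount(token)


def validate_total(item_lines, total_line):
    """Return the stated total if the items sum to it, otherwise None."""
    items = [parse_line(line) for line in item_lines]
    total = parse_line(total_line)
    if total is None or total[1] is None:
        return None
    if any(item is None or item[1] is None for item in items):
        return None
    item_sum = sum(amount for _, amount in items)
    return total[1] if item_sum == total[1] else None
```

Under these assumptions, a statement whose amounts still add up after repair yields its total, while any unreadable amount or arithmetic mismatch yields `None`, so the document would be left out of the sample rather than trusted.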

Type: Conference item (Presentation)
Title: Reconstructing financial statements
Event: DESI II: Second International Workshop on Supporting Search and Sensemaking for Electronically Stored Information in Discovery Proceedings
Location: University College London, UK
Dates: June 25, 2008
Open access status: An open access version is available from UCL Discovery
Publisher version: http://www.cs.ucl.ac.uk/staff/S.Attfield/desi/DESI...
Language: English
URI: https://discovery-pp.ucl.ac.uk/id/eprint/9137