Take your analog data digital
Take your analog data digital for a faster, more efficient way to work
We live in an increasingly digital world. We’re used to having almost all the data we need at our fingertips with the click of a button or a tap of the screen. But information still frequently reaches us in ways that aren’t digital, such as paper receipts, handouts at conferences, or notes from a whiteboard at a meeting. That’s why, at Microsoft, we’ve been developing ways for you to easily move your analog data into a digital format to help you be more productive.
Quickly capture paper-based data to unlock new insights
To help you bring analog data into Excel, we developed the Insert Data from Picture feature. With this feature, you can easily grab any data in a table format (financial spreadsheets, work schedules, task lists, timetables, and so on) and convert it to a digital format in Excel, so you can arrange and analyze that information quickly and in context to make better decisions on the fly.
The Insert Data from Picture feature works by combining advanced optical character recognition (OCR) technology, layout understanding techniques, and machine learning models to transform paper-based information into digital data. We’ve used these and other technologies across Office apps, including the PDF Reflow feature for Word, Office Lens, and the Seeing AI app.
Convert handwritten notes to digital text with ease
Let’s look at how we’re helping users go from analog to digital. Before, you had to copy whiteboard notes by hand at the end of a meeting. Later, you could take photos of whiteboards with your phone. Either way, you still had to type in the notes afterward. Now, with ink grab, you can take a picture of the notes scribbled on a physical whiteboard and bring it into tools like OneNote, where you can quickly convert the notes to text to share in messages, documents, or presentations.
We’re also exploring more advanced ways to help you convert analog data to digital information that you can use across your Office apps. For example, we envision that you’ll be able to take a picture of handwritten notes on paper and import the text directly. Other areas we’re exploring include scanning a picture, PDF annotation, and signing.
Analyzing the physical world
In addition to importing data from a physical piece of paper, there are many other ways we see customers leveraging Excel to help them analyze data from the real world. For example, with the Hacking STEM program, teachers use Excel to help students explore and analyze real-world phenomena. Leveraging the Excel Data Streamer add-in, students can easily move data from the physical world in and out of Excel, which introduces them to data science and the internet of things (IoT). One example is using pressure sensors to measure brain impact during a concussion.
More on how the features work (available on iOS and Android)
Insert Data from Picture brings printed, tabular data directly into Excel, where you can perform various kinds of analysis that are time-consuming or even impossible with pen and paper.
Going from analog to digital
Simply open the Excel app on your phone, snap a picture of your paper-based data table, crop and review the image, and you’re done. No technical skills are required. The data table is automatically embedded and ready for analysis in Excel.
“If only I had this before, I’d have saved so much time avoiding manual data entry” is one very popular reaction from Excel users so far. Going from analog to digital gives you time for more important tasks and is more secure. In addition, once a printed copy is converted, digital records are easier to organize and search, which eliminates paper waste and reduces the need for physical storage space.
Here are a few examples of how users can benefit from Insert Data from Picture:
- Consolidate dozens, hundreds, or even thousands of rows of paper-based data in a flash, all without a single pencil mark.
- Create illustrative charts and graphs to summarize information that was extremely difficult to communicate before.
- Use Ideas in Excel to surface new trends and dependencies you might have missed when your data only existed on paper.
- Easily archive data documents for future reference and compliance purposes.
Inside Excel: the technology behind the feature
To enable seamless data extraction from an image, Insert Data from Picture reuses many of the same OCR and layout technologies previously released for Word.
First, the image is analyzed to detect the main building blocks of the document, like text and graphical elements (e.g., table borders). For this, we further enhanced our Microsoft-built OCR engine to handle images with scattered text, which is often the case in tabular data, and leveraged image processing techniques to detect graphical elements.
Once the image is decomposed into its main building blocks, Insert Data from Picture starts inferring the layout of the table. The most important part is detecting the grid of the table, which is done by generating grid candidates from horizontal and vertical lines (for bordered tables) and empty spaces between text (for borderless tables). After all the candidates are generated, the feature uses a combination of various heuristics and machine learning models to filter false positives and produce the final grid that will be reconstructed in the output. Producing that final grid relies on the analysis of each cell to build out other structures like paragraphs, font properties, and lists.
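To make the borderless-table case more concrete, here is a minimal Python sketch of how column separator candidates could be generated from empty space between text. It is an illustration only, not the implementation used in Excel: the function names, the bounding-box format, and the min_gap threshold are all assumptions, and a simple gap-width rule stands in for the heuristics and machine learning models described above.

```python
from typing import List, Tuple

# Hypothetical input format: one (x0, y0, x1, y1) box per recognized word.
Box = Tuple[float, float, float, float]


def merge_intervals(intervals: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Merge overlapping horizontal spans into disjoint spans."""
    merged: List[Tuple[float, float]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous span, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def column_separator_candidates(boxes: List[Box], min_gap: float = 10.0) -> List[float]:
    """Return x positions of vertical whitespace gaps wide enough to be column breaks.

    min_gap is an assumed tuning parameter (in pixels), not a documented value.
    """
    spans = merge_intervals([(x0, x1) for x0, _, x1, _ in boxes])
    candidates = []
    for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
        if right_start - left_end >= min_gap:
            # Place the candidate separator in the middle of the empty gap.
            candidates.append((left_end + right_start) / 2)
    return candidates


# Two rows of words laid out in three visual columns.
words = [
    (0, 0, 40, 10), (100, 0, 150, 10), (200, 0, 230, 10),
    (5, 20, 35, 30), (102, 20, 140, 30),
]
separators = column_separator_candidates(words)
```

For the sample words, the text occupies three horizontal bands, so the sketch proposes two separators, one in each empty gap. A real system would then filter such candidates, since wide spaces inside a single cell would otherwise produce false positives.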
Artificial Intelligence (AI) for more accurate results
If you have used OCR-based features before, you know they don’t always get everything right. Insert Data from Picture takes special care to highlight potential errors, so you can focus on individual entries rather than the whole thing. The good news is that this is an AI-powered feature designed for continuous improvement, meaning that the data accuracy will increase over time. To achieve this, we leveraged a collection of machine learning models, where each model detects a specific case of misinterpreted content (e.g., missed or added characters).
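The idea of several detectors, each looking for one kind of misread content, can be sketched in Python as follows. This is a hypothetical illustration: the detector names and the hand-written rules are assumptions that stand in for trained machine learning models, and none of this reflects the actual Excel implementation.

```python
import re
from typing import Callable, Dict, List


def digit_letter_mixup(text: str) -> bool:
    """Flag number-like text that contains letters often confused with digits."""
    number_like = re.fullmatch(r"[0-9OolIS.,]+", text) is not None
    has_letter = re.search(r"[OolIS]", text) is not None
    has_digit = re.search(r"[0-9]", text) is not None
    return number_like and has_letter and has_digit


def stray_punctuation(text: str) -> bool:
    """Flag characters that often come from table borders read as text."""
    return re.search(r"[|~`^]", text) is not None


# Each entry plays the role of one specialized model (names are hypothetical).
DETECTORS: Dict[str, Callable[[str], bool]] = {
    "digit_letter_mixup": digit_letter_mixup,
    "stray_punctuation": stray_punctuation,
}


def flag_cells(cells: List[str]) -> Dict[int, List[str]]:
    """Map each suspicious cell index to the detectors that fired for it."""
    flagged: Dict[int, List[str]] = {}
    for index, text in enumerate(cells):
        reasons = [name for name, detect in DETECTORS.items() if detect(text)]
        if reasons:
            flagged[index] = reasons
    return flagged


review = flag_cells(["Total", "1O0", "42", "3|5"])
```

Here only the second and fourth cells are flagged, so a reviewer would check those two entries instead of rereading the whole table.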
Resource Credit | Microsoft 365