State Street’s Chief Scientist on How to Tame Big Data Using Semantics
Financial institutions are accumulating data at a rapid pace. Between massive amounts of internal information and an ever-growing pool of unstructured data to deal with, banks' data management and storage capabilities are being stretched thin. But relief may come in the form of semantic databases, which could be the next evolution in how banks manage big data, says David Saul, Chief Scientist for Boston-based State Street Corp.
The semantic data model associates a meaning with each piece of data to allow for better evaluation and analysis, Saul notes, adding that because they can analyze relationships, semantic databases are particularly well-suited to the financial services industry.
"Our most important asset is the data we own and the data we act as a custodian for," he says. "A lot of what we do for our customers, and what they do with the information we deliver to them, is aggregate data from different sources and correlate it to make better business decisions."
Semantic technology, notes Saul, is based on the same technology "that all of us use on the World Wide Web, and that's the concept of being able to hyperlink from one location to another location. Semantic technology does the same thing for linking data."
In a semantic database, each piece of data has a meaning associated with it, says Saul. For example, a typical data field might be a customer name. Semantic technology knows where that piece of information is in both the database and unstructured data, he says. Semantic data would then allow a financial institution to create a report or dashboard that shows all of its interactions with that customer.
"The way it's done now, you write data extract programs and create a repository," he says. "There's a lot of translation that's required."
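As a rough illustration of the idea Saul describes, the sketch below stores facts as subject-predicate-object triples and follows links between records to build a single view of one customer. It is a toy in plain Python, not State Street's system or a W3C-standard implementation; every identifier, predicate name and data value here is hypothetical.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple. All values are made up.
TRIPLES = [
    ("crm:cust-001", "hasName", "Acme Pension Fund"),
    ("crm:cust-001", "sameAs", "custody:acct-77"),     # link between two systems
    ("custody:acct-77", "holdsAsset", "US Treasury 10Y"),
    ("email:msg-42", "mentions", "crm:cust-001"),      # unstructured source
    ("email:msg-42", "hasSubject", "Quarterly review"),
]

def linked_ids(start, triples):
    """Return every identifier reachable from `start` through sameAs links."""
    seen, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for s, p, o in triples:
            if p != "sameAs":
                continue
            # sameAs is treated as symmetric, so check both directions.
            for a, b in ((s, o), (o, s)):
                if a == node and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen

def customer_view(customer_id, triples):
    """Collect every fact that touches the customer or a linked record."""
    ids = linked_ids(customer_id, triples)
    view = defaultdict(list)
    for s, p, o in triples:
        if p == "sameAs":
            continue
        if s in ids:
            view[p].append(o)
        elif o in ids:
            view[f"inverse:{p}"].append(s)
    return dict(view)

if __name__ == "__main__":
    print(customer_view("crm:cust-001", TRIPLES))
```

Because the link between the CRM record and the custody account is itself stored as data, the view is assembled by following links rather than by a purpose-built extract program.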
Semantic data can also be greatly beneficial for banks in conducting risk calculations for regulatory requirements, Saul adds.
"That is something regulators are constantly looking for us to do, they want to know what our total exposure is to a particular customer or geographic area," he says. "That requires quite a bit of development effort, which equals time and money. With semantic technology, once you describe the data sources, you can do that very, very quickly. You don't have to write new extract programs."
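A minimal sketch of what "describing the data sources" could look like in practice: each source gets a small mapping from its own field names to shared concepts, and one generic function then totals exposure by customer or by geography. The source names, field names, shared vocabulary and figures are all assumptions made for illustration, not anything drawn from State Street.

```python
from collections import defaultdict

# Hypothetical records from two systems that use different field names.
LOANS = [
    {"borrower": "Acme", "country": "DE", "outstanding": 5_000_000},
    {"borrower": "Birch", "country": "US", "outstanding": 2_000_000},
]
DERIVATIVES = [
    {"counterparty": "Acme", "region_code": "DE", "mark_to_market": 1_500_000},
]
SOURCES = {"loans": LOANS, "derivatives": DERIVATIVES}

# The "semantic description": local field name -> shared concept.
DESCRIPTIONS = {
    "loans": {
        "borrower": "customer",
        "country": "geography",
        "outstanding": "exposure",
    },
    "derivatives": {
        "counterparty": "customer",
        "region_code": "geography",
        "mark_to_market": "exposure",
    },
}

def normalize(sources, descriptions):
    """Yield each record re-keyed by shared concepts."""
    for name, rows in sources.items():
        mapping = descriptions[name]
        for row in rows:
            yield {mapping[k]: v for k, v in row.items() if k in mapping}

def total_exposure(by, sources=SOURCES, descriptions=DESCRIPTIONS):
    """Sum exposure grouped by a shared concept such as 'customer'."""
    totals = defaultdict(int)
    for fact in normalize(sources, descriptions):
        totals[fact[by]] += fact["exposure"]
    return dict(totals)

if __name__ == "__main__":
    print(total_exposure("customer"))
    print(total_exposure("geography"))
```

The same function answers both the per-customer and the per-region question; only the grouping concept changes, which is the kind of reuse Saul contrasts with writing a new extract program for each request.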
The technology to run semantic databases is based on standards developed by the World Wide Web Consortium, notes Saul, and while not widely used in financial services, semantic data models are prevalent in fields such as health care and academia.
State Street has begun to implement semantic databases on a limited basis and is "in the early stages," Saul says.
"With any new technology, clearly you want to make sure this is based on a very solid foundation," he adds. "The tools in order to do this are only just now becoming available. Our plan is to prove this out internally and then work with our customers."
Another benefit of semantic databases that Saul points to is that they can be implemented incrementally. "You can start out with two databases, describe them semantically and put them together, and then later add the third, fourth and fifth," he says.
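The incremental approach might be sketched as a catalog that accepts one described source at a time. The class, its methods and the sample data below are hypothetical and only meant to show that adding a later source does not require touching the earlier ones.

```python
class LinkedCatalog:
    """Toy catalog of semantically described sources (illustrative design)."""

    def __init__(self):
        self.facts = []

    def add_source(self, rows, mapping):
        """Translate a source's rows into shared terms using its description."""
        for row in rows:
            self.facts.append(
                {mapping[k]: v for k, v in row.items() if k in mapping}
            )

    def customers(self):
        """Return the distinct customers known across all added sources."""
        return sorted({f["customer"] for f in self.facts if "customer" in f})


catalog = LinkedCatalog()

# Start with two sources, each described by its own field mapping.
catalog.add_source([{"borrower": "Acme"}], {"borrower": "customer"})
catalog.add_source([{"counterparty": "Birch"}], {"counterparty": "customer"})
first_pass = catalog.customers()

# Later, add a third source; the first two are left unchanged.
catalog.add_source([{"client_name": "Cedar"}], {"client_name": "customer"})
second_pass = catalog.customers()

if __name__ == "__main__":
    print(first_pass)
    print(second_pass)
```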
As banks continue to try to aggregate data from an ever-growing number of sources, Saul believes semantic data models will become essential in corralling big data going forward.
"As you get into the big data space, and now you're looking at more and more data, you need new tools," he notes.
Bryan Yurcan is associate editor for Bank Systems and Technology. He has worked in various editorial capacities for newspapers and magazines for the past 8 years. After beginning his career as a municipal and courts reporter for daily newspapers in upstate New York, Bryan has ...