Mining and repurposing of raw data into new types of knowledge – 4
This is my fourth installment in a series on repurposing accumulated stores of raw data and of processed knowledge, and on the potential, and the potential pitfalls, of commoditizing and marketing this as an information product (see Macroeconomics and Business, postings 50, 53 and 55 for parts 1-3). I have at least briefly looked into the business of raw data and processed information as a marketable commodity from the perspectives of the managing and selling business, the clients they sell this information access to, and the original sources of this data (see Part 1, Part 2 and Part 3 respectively). And as a recurring theme I have repeatedly turned to the valuation of this information flow throughout.
I have written in other series about the valuation of information per se (e.g. see my nine-part series Intelligence as a Quantitative Distinction, included in Macroeconomics and Business as postings 21 and 23-30). My goal in this posting is to continue this series in progress and, in a real sense, to continue that series too. My focus here is on the balance between risk and value in raw and processed data that would be commoditized for the marketplace, as for example by a business that repurposes business data.
The basic axiomatic model (specifying the core assumptions that I bring to this discussion):
I start with what I will at least for this series posit as an axiomatic statement:
1. Any given unit of information achieves its maximum marketable value at any given time, place and circumstance where it can be transferred in that marketplace with zero risk to buyer, seller or to that information’s sources (collectively referred to here as the three participants or the three participant classes.)
And as a direct continuation of that I add axioms 2 and 3:
2. Any nonzero risk for any one or more of these participants decreases the maximum realizable marketplace value of that information.
3. Absent compensating factors this decrease is proportional to the total combined perceivable risk that this information carries across all three participant classes.
And to clarify a possible point not so far specifically stated:
4. The marketable value that any given unit of information could hold is the maximum sales price it could realize in the specific marketplace in which it is offered, at the time of sale, minus any risk-based devaluation as specified by axioms one through three. In determining net value, it is simply what the market would bear minus risk expenses.
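As a minimal sketch of how these four axioms fit together, consider the following. It rests on assumptions that the axioms themselves do not specify: risk is scored on an arbitrary non-negative scale for each participant class, the "total combined" risk is taken as a plain sum, and the proportionality factor (called devaluation_per_risk_unit here) is a hypothetical parameter. The names and numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticipantRisk:
    """Perceived risk per participant class, on a hypothetical non-negative scale."""
    buyer: float = 0.0
    seller: float = 0.0
    source: float = 0.0

    def total(self) -> float:
        # Axiom 3 refers to total combined risk across all three classes;
        # using a plain sum is an assumption made for this sketch.
        return self.buyer + self.seller + self.source


def net_marketable_value(max_price: float,
                         risk: ParticipantRisk,
                         devaluation_per_risk_unit: float) -> float:
    """Axiom 4: what the market would bear minus a risk-based devaluation."""
    if min(risk.buyer, risk.seller, risk.source) < 0:
        raise ValueError("risk levels must be non-negative")
    # Axiom 3: absent compensating factors, the decrease is proportional
    # to the total combined perceivable risk.
    devaluation = devaluation_per_risk_unit * risk.total()
    return max_price - devaluation


# Axiom 1: with zero risk to all three participants, value is at its maximum.
zero_risk = net_marketable_value(1000.0, ParticipantRisk(), 50.0)

# Axiom 2: any nonzero risk to any participant lowers the realizable value.
some_risk = net_marketable_value(
    1000.0, ParticipantRisk(buyer=1.0, seller=0.5, source=2.5), 50.0
)

print(zero_risk, some_risk)
```

The sketch leaves the result unclamped, so a sufficiently risky package could come out with negative net value. Whether that should be read as a liability or floored at zero is a modeling choice that the axioms leave open.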
I have cited risk in passing in all three of the preceding installments to this series, and I turn in this one to address it more directly as a complex of issues and considerations. I start, as a point of orientation, with at least partial lists of some of the types of risk factors that would enter into the evaluation and valuation of information, whether raw data or processed knowledge, for each of the three participant classes.
First, consider risk from the perspective of the data source (as source per se where they could also be seller):
• For purposes of this posting, “data source” includes several potentially, and even significantly, separable groups and organizational levels. First, consider individuals and individual families as end-users and marketplace purchasers of products and services. These, together with small and single-employee businesses, which I also include here, can be viewed as the most fundamental data sources, and most of what might be gathered from this group would be completely raw, unprocessed data.
• Then consider how a larger purchasing business or other organization might pool, meta-tag and analyze its raw data of this type, perhaps merging data on the purchase and use of resources from multiple suppliers and other sources, with entirely in-house generated data and processed knowledge added in.
• And when organizations purchase together to leverage greater buyer-side strength in the marketplace, this can include pooled content from multiple data sources, with all of that still considered original source.
Loss of control of personally identifiable information regarding individuals is one obvious area where improper release and sharing of raw data and source-identifiable processed data would be compromising and carry monetary risk. Identity theft is, however, just one of several possible ways in which this type of data can be misused at the avoidable, unnecessary expense of the data sources involved. But for organizations, a much wider range of data types might require confidential handling and to illustrate that point I cite a military supply chain example and one that might not come immediately to mind: data on the precise amounts of toilet paper, underarm deodorants and tubes of toothpaste provided through the commissaries and supply depots that service forward military bases. This type of data can be used to produce remarkably accurate estimates as to both troop strength and even troop readiness – information that holds obvious military intelligence and national defense value. For purposes of this posting I simply note that the more complex and comprehensive the data collected, the more can be done with it that might threaten or undermine the data sources involved.
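The kind of inference described in that supply chain example can be illustrated with simple arithmetic. The sketch below is hypothetical throughout: the item names, the supplied quantities and the per-person consumption rates are invented for illustration, and taking the median of the per-item estimates is just one plausible way to combine them.

```python
from statistics import median
from typing import Mapping


def estimate_headcount(units_supplied: Mapping[str, float],
                       units_per_person: Mapping[str, float]) -> float:
    """Estimate how many people a supply flow serves.

    Each item yields its own estimate (units supplied divided by an assumed
    per-person consumption rate over the same period); the median of those
    estimates is returned so that one unusual item does not dominate.
    """
    estimates = [
        units_supplied[item] / units_per_person[item]
        for item in units_supplied
        if units_per_person.get(item, 0) > 0
    ]
    if not estimates:
        raise ValueError("no items with a known per-person consumption rate")
    return median(estimates)


# Hypothetical monthly supply figures and assumed per-person monthly usage.
monthly_supply = {
    "toilet_paper_rolls": 4000.0,
    "toothpaste_tubes": 1050.0,
    "deodorant_sticks": 980.0,
}
assumed_usage = {
    "toilet_paper_rolls": 4.0,
    "toothpaste_tubes": 1.0,
    "deodorant_sticks": 1.0,
}

headcount = estimate_headcount(monthly_supply, assumed_usage)
print(headcount)
```

Even with rates this crude, three unrelated items converge on roughly the same figure, which is why seemingly mundane supply data can warrant confidential handling.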
Next consider risk from the perspective of the data access buyer:
Any threat to the sources of this flow of data, whether real or simply potential but perceived, creates liability risks for any organization that would acquire it, access purchasers definitely included. This is an area where data aggregation as anonymized, demographics-level information becomes important, both in that it can reveal insight from the raw data and in that it can serve as a firewall, limiting liability exposure from capturing or holding details that could compromise individual sources. I add here, however, that the more pre-processed the data shared, the more limited its potential use, and the greater the potential for risk and loss from misinterpretation. The closer to raw the data acquired, the fewer potentially invalid assumptions can be hidden in it.
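As a rough sketch of aggregation serving as that kind of firewall, the following reduces individual records to group-level counts and averages and withholds any group too small to hide an individual. The field names (customer_id, region, spend), the sample records and the minimum group size are all assumptions made for illustration. Aggregation of this kind reduces exposure but does not by itself guarantee anonymity.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_group(records, group_key, value_key, min_group_size=5):
    """Summarize records per group, dropping identifiers and small groups."""
    grouped = defaultdict(list)
    for record in records:
        # Only the grouping field and the measured value are carried forward;
        # any identifying fields in the record are left behind.
        grouped[record[group_key]].append(record[value_key])

    summary = {}
    for group, values in grouped.items():
        if len(values) < min_group_size:
            # Suppress groups small enough that a summary could point
            # back to specific individuals.
            continue
        summary[group] = {"count": len(values), "mean": mean(values)}
    return summary


# Hypothetical raw records as a buyer might otherwise receive them.
raw_records = [
    {"customer_id": "c1", "region": "north", "spend": 10.0},
    {"customer_id": "c2", "region": "north", "spend": 20.0},
    {"customer_id": "c3", "region": "north", "spend": 30.0},
    {"customer_id": "c4", "region": "south", "spend": 99.0},
]

shared = aggregate_by_group(raw_records, "region", "spend", min_group_size=3)
print(shared)
```

The trade-off noted above is visible here: the buyer of the aggregated output never sees the single "south" record at all, and cannot check whether an average conceals an unusual distribution in the underlying data.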
Finally, consider risk from the perspective of the data broker or other third-party seller:
• If a broker accumulates business intelligence itself for repackaging and distribution from its own information management systems (its own servers and associated hardware and software), it assumes direct risk for any possible misuse of this data flow insofar as that might be linked to its agreements to sell or otherwise provide access.
• That broker also carries the risk that its data accumulations might be compromised and improperly accessed.
• If a broker simply acts as a go-between, helping source providers (or other brokers, sources or third-party sellers) to connect with buyers, it might not face direct risk from hacker access to or misuse of the data whose transfer it facilitates, but it will probably still carry significant indirect risk of being sued as a crucially involved participant if anything goes wrong.
I wrote in my series Intelligence as a Quantitative Distinction (see Macroeconomics and Business, postings 21 and 23-30) about the importance of developing a comprehensive approach to the valuation of business intelligence, treating it more as a standard commodity, free of ad hoc valuation. With this series and this posting in it, I note that one of the industries that would find greatest value in that, and in having a consistent model for the replicable valuation of information, is insurance.
I will look more closely into that last point in my next series installment, noting in anticipation that insurance companies might be seen both as data repurposers and resellers in their own right, and as risk reducers for others. This, among other things, creates potential conflict of interest and risk creation issues for these companies.
As an addendum note regarding the four axioms I presented above:
In positing my four axioms, I intentionally left out of consideration any operational or business costs that might arise and limit the realizable value that specific packages of business information or intelligence might yield. That is because my purpose in specifying those axioms was to set parameters for the valuation of this information per se. As such, I intentionally left out factors that would more specifically determine the efficacy or even the viability of any given business or business model that might seek to commoditize and market information. Risk, in contradistinction, stays in the model: even when it arises entirely as a result of internal-to-the-business practices, policies, decisions and outcomes, the fact of that risk and its level are determined outside of the organization, in the marketplace in which that business operates.
You can find this posting and series at Ubiquitous Computing and Communications – everywhere all the time and at Macroeconomics and Business. I specifically point out as directly applicable background reading Business Intelligence as a Qualitative Distinction – a requirement for effective rules of monetization and my nine part series: Intelligence as a Quantitative Distinction (see Macroeconomics and Business, postings 21 and 23-30.)