Platt Perspective on Business and Technology

Reconsidering Information Systems Infrastructure 11

This is the 11th posting to a series that I am developing, with a goal of analyzing and discussing how artificial intelligence and the emergence of artificial intelligent agents will transform the electronic and online-enabled information management systems that we have and use. See Ubiquitous Computing and Communications – everywhere all the time 2 and its Page 3 continuation, postings 374 and loosely following for Parts 1-10. And also see two benchmark postings that I initially wrote just over six years apart but that together provided much of the specific impetus for my writing this series: Assumption 6 – The fallacy of the Singularity and the Fallacy of Simple Linear Progression – finding a middle ground and a late 2017 follow-up to that posting.

I conceptually divide artificial intelligence tasks and goals into three loosely defined categories in this series. And since Part 9 I have been discussing artificial intelligence agents and their systems requirements in a goals and requirements-oriented manner that is consistent with that, with those categorical types partitioned out from each other as follows:

• Fully specified systems goals and their tasks (e.g. chess with its fully specified rules defining a win and a loss, etc. for it),
• Open-ended systems goals and their tasks (e.g. natural conversational ability with its lack of corresponding fully characterized performance end points or similar parameter-defined success constraints), and
• Partly specified systems goals and their tasks (as in self-driving cars where they can be programmed with the legal rules of the road, but not with a correspondingly detailed algorithmically definable understanding of how real people in their vicinity actually drive and sometimes in spite of those rules: driving according to or contrary to the traffic laws in place.)

And I have focused up to here in this developing narrative on the first two of those task and goals categories, only noting the third of them as a transition category, where success in resolving tasks there would serve as a bridge from developing effective artificial specialized intelligence agents (which can carry out fully specified tasks and which have become increasingly well understood, both in principle and in practice) to the development of true artificial general intelligence agents (which can carry out open-ended tasks and whose development is still only partly understood.)

And to bring this orienting starting note for this posting up to date for what I have offered regarding that middle ground category, I add that I further partitioned that general category in Part 10, according to its included degrees of task performance difficulty, using what I identify as a swimming pool model:

• With its simpler, shallow end tasks that might arguably in fact belong in the fully specified systems goals and tasks category, as difficult entries there, and
• Deep end tasks that might arguably belong in the above-repeated open-ended systems goals and tasks category.

I chose self-driving vehicles and their artificial intelligence agent drivers as an intermediate, partly specified systems goal because that challenge at least appears to belong in this category, and with a degree of difficulty that would position it at least closer to the shallow end than the deep end there, and probably much closer.

Current self-driving cars have performed successfully (reaching their intended destinations and without accidents), both in controlled settings and on the open road in the presence of actual real-world drivers and their driving. And their guiding algorithms do seek to at least partly control for and account for what might be erratic circumambient driving on the part of others on the road around them, by for example allowing extra spacing between their vehicles and others ahead of them on the road. But even there, an “aggressive” human driver might suddenly squeeze into that space without signaling a lane change, suddenly leaving the self-driving vehicle following too closely as well. So this represents a task that might be encoded into a single if complex overarching algorithm, as supplemented by a priori sourced expert systems data and insight based on real-world human driving behavior. But it is also one that would require ongoing self-learning and improvement on the part of the artificial intelligence agent drivers involved, both within these specific vehicles and between them as well.

• If all cars and trucks on the road were self-driving and all of them were actively sharing action and intention information with at least nearby vehicles in that system, all the time and real-time, self-driving would qualify as a fully specified systems task, and for all of the vehicles on the road. As soon as the wild card of human driving enters this narrative, that ceases to hold true. And the larger the percentage of human drivers actively on the road, the more statistically likely it becomes that one or more in the immediate vicinity of any given self-driving vehicle will drive erratically, making this a distinctly partly specified task challenge.

Let’s consider what that means in at least some detail. And I address that challenge by posing some risk management questions that this type of concurrent driving would raise, where the added risk that those human drivers bring with them moves this out of a fully specified task category:

• What “non-standard” actions do real world drivers make?

This would include illegal lane changes, running red lights and stop signs, illegal turns, speeding and more. But more subtly perhaps, this would also include driving at, for example, a posted speed limit but under road conditions (e.g. in fog or during heavy rain) where that would not be safe.

• Are there circumstances where such behavior might arguably be more predictably likely to occur, and if so what are they and for what specific types of problematical driving?
• Are there times of the day, or other identifiable markers for when and where specific forms of problematical driving would be more likely?
• Are there markers that would identify problem drivers approaching, and from the front, the back or the side? Are there risk-predictive behaviors that can be identified before a possible accident, that a self-driving car and its artificial intelligence agent can look for and prepare for?
• What proactive accommodations could a self-driving car or truck make to lessen the risk of accident if, for example, its sensors detect a car that is speeding and weaving erratically from lane to lane in the traffic flow, and without creating new vulnerabilities from how it would respond to that?

Consider, in that “new vulnerabilities” category, the example that I have already offered in passing above, when noting that increasing the distance between a self-driving car and a vehicle that is directly ahead of it might in effect invite a third driver to squeeze in between them, even if that left the squeezing-in driver tailgating the leading vehicle, and the self-driving car now behind them tailgating in turn. A traffic light ahead suddenly changing to red, or any other driving circumstance that would force the lead car in all of this to suddenly hit their brakes, could cause a chain reaction accident.
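
To make that trade-off concrete, here is a minimal sketch, in Python, of the kind of probabilistic spacing adjustment just described. Everything in it – the risk cues, their weights, and the spacing formula – is a hypothetical placeholder offered for illustration only, and is not drawn from any actual self-driving system.

```python
# Hypothetical sketch: adjust following distance from an estimated risk
# that a nearby human driver will behave erratically. All weights and
# thresholds here are illustrative placeholders, not real system values.

def erratic_risk(lane_weaving: float, speed_variance: float,
                 signals_used: bool) -> float:
    """Combine a few observable cues into a rough 0..1 risk score."""
    score = 0.5 * lane_weaving + 0.4 * speed_variance
    if not signals_used:
        score += 0.2
    return min(score, 1.0)

def target_gap_seconds(base_gap: float, risk: float,
                       max_extra: float = 2.0) -> float:
    """Widen the time gap to the car ahead as estimated risk rises,
    capped so the gap does not grow so large that it invites cut-ins."""
    return base_gap + max_extra * risk

if __name__ == "__main__":
    risk = erratic_risk(lane_weaving=0.6, speed_variance=0.3, signals_used=False)
    print(f"risk={risk:.2f}, gap={target_gap_seconds(2.0, risk):.1f}s")
```

The cap on the added spacing in this sketch reflects exactly the vulnerability noted above: widening the gap reduces one risk while opening the door to the cut-in that creates another.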

What I am leading up to here in this discussion is a point that is simple to explain and justify in principle, even as it remains difficult to operationally resolve as a challenge in practice:

• With the difficulty in these less easily rules-defined challenges increasing as the tasks that they arise in fit into deeper and deeper areas of that swimming pool, in my above-cited analogy.

Fully specified systems goals and their tasks might be largely or even entirely deterministic in nature and rules-determinable, where condition A always calls for action and response B, or at least a selection from among a specifiable set of particular such actions, chosen to meet the goals-oriented needs of the agent taking them. But partly specified systems goals and their tasks are of necessity significantly stochastic in nature, with probabilistic evaluations of changing task context becoming more and more important as the tasks involved fit more and more into the deep end of that pool. And they become more open-endedly flexible in their response and action requirements too, no longer fitting cleanly into any given set of a priori if-A-then-B rules.
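
The contrast that I am drawing here can be reduced to a few lines of code. The following sketch, with entirely made-up conditions and probability weights, is offered only to show the difference in kind between a fully specified if-A-then-B rule table and a stochastic policy that has to weight possible responses against an uncertain reading of its context.

```python
import random

# Fully specified: condition A always maps to one definite action B.
RULE_TABLE = {
    "light_is_red": "stop",
    "light_is_green": "proceed",
}

# Partly specified: the same observation admits several plausible responses,
# each carrying an estimated probability weight. States and numbers here
# are illustrative assumptions only.
STOCHASTIC_POLICY = {
    "car_ahead_slowing": [("brake_gently", 0.7),
                          ("change_lanes", 0.2),
                          ("brake_hard", 0.1)],
}

def deterministic_action(condition: str) -> str:
    return RULE_TABLE[condition]

def stochastic_action(observation: str) -> str:
    actions, weights = zip(*STOCHASTIC_POLICY[observation])
    return random.choices(actions, weights=weights, k=1)[0]

print(deterministic_action("light_is_red"))    # always "stop"
print(stochastic_action("car_ahead_slowing"))  # varies from run to run
```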

Airplanes have had autopilot systems for years and even for human generations now, with the first of them dating back as far as 1912: more than a hundred years ago. But these systems have essentially always had human pilot back-up if nothing else, and have for the most part been limited to carrying out specific tasks, and under circumstances where the planes involved were in open air without other aircraft coming too close. Self-driving cars have to be able to function on crowded roads and without human back-up – and even when a person is sitting behind the wheel, it has to be assumed that they will not always be attentive to what the car or truck is doing, taking its self-driving capabilities for granted.

And with that noted, I add here that this is a goal that many are actively working to perfect, at least to a level of safe efficiency that matches the driving capabilities of an average safe driver on the road today. See, for example:

• The DARPA autonomous vehicle Grand Challenge, and
• Burns, L.D. and Shulgan, C. (2018) Autonomy: The Quest to Build the Driverless Car – and How It Will Reshape Our World. HarperCollins.

I am going to continue this discussion in a next series installment where I will turn back to reconsider open-ended goals and their agents again, and more from a perspective of general principles. Meanwhile, you can find this and related postings and series at Ubiquitous Computing and Communications – everywhere all the time and its Page 2 and Page 3 continuations. And you can also find a link to this posting, appended to the end of Section I of Reexamining the Fundamentals as a supplemental entry there.

Moore’s law, software design lock-in, and the constraints faced when evolving artificial intelligence 8

Posted in business and convergent technologies, reexamining the fundamentals by Timothy Platt on August 22, 2019

This is my 8th posting to a short series on the growth potential and constraints inherent in innovation, as realized as a practical matter (see Reexamining the Fundamentals 2, Section VIII for Parts 1-7.) And this is also my fifth posting to this series, to explicitly discuss emerging and still forming artificial intelligence technologies as they are and will be impacted upon by software lock-in and its imperatives, and by shared but more arbitrarily determined constraints such as Moore’s law (see Parts 4-7.)

I focused, for the most part in Part 7 of this series, on offering what amount to analogs to the simplified assumption Thévenin circuits of electronic circuit design. Thévenin’s theorem and the simplified and even detail-free circuits that it specifies serve to calculate and mimic the overall voltage and resistance parameters for what are construed to be entirely black-box electronic systems with their more complex circuitry, the detailed nature of which is not of importance in that type of analysis. There, the question is not one of what that circuitry specifically does or how, but rather of how it would or would not be able to function, with what overall voltage and resistance requirements and specifications, in larger systems.
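
As a point of reference for readers who have not worked with this, the standard textbook form of that reduction is as follows; this is the general result itself, not anything specific to the analyses of this series. Any linear two-terminal network, however complex internally, can be replaced, as seen from its terminals, by a single voltage source in series with a single resistance:

```latex
V_{\mathrm{Th}} = V_{\mathrm{oc}}, \qquad
R_{\mathrm{Th}} = \frac{V_{\mathrm{oc}}}{I_{\mathrm{sc}}}
\quad \text{(open-circuit voltage and short-circuit current at the terminals).}

% Example: a voltage divider (source V, series resistor R_1, output across R_2)
V_{\mathrm{Th}} = V\,\frac{R_2}{R_1 + R_2}, \qquad
R_{\mathrm{Th}} = \frac{R_1 R_2}{R_1 + R_2}.
```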

My simplified assumption representations of Part 7 treated both brain systems and artificial intelligence agent systems as black box entities, and looked at general timing and scale parameters to determine their overall maximum possible size, and therefore their maximum overall complexity, given the fact that any and all functional elements within them would have larger than zero minimum volumes, as well as minimal time-to-task-completion requirements for what they would do. And I offered my Part 7 analyses there as first step evaluations of these issues, that of necessity would require refinement and added detail to offer anything like actionable value. Returning briefly to the Thévenin equivalents that I just cited above, by way of analogous comparison here: the details of the actual circuits that would be simplistically modeled there might not be important or even germane to the end-result Thévenin circuits arrived at, but those simplest voltage and resistance matching equivalents would of necessity include within them the cumulative voltage and resistance parameters of all of the detail in those circuit black boxes, even if only as rolled into overall requirement summaries for those circuits as a whole.

My goal for this posting is to at least begin to identify and discuss some of the complexity that would be rolled into my simplified assumptions models, and in a way that matches how a Thévenin theorem calculation would account for internal complexity in its modeled circuits’ overall electrical activity and requirements calculations, but without specifying their precise details either. And I begin by considering the functional and structural nodes that I made note of in Part 7, in both brain and artificial intelligence agent contexts, and the issues of single processor versus parallel processing systems and subsystems. And this, of necessity, means considering the nature of the information processing problems to be addressed by these systems too.

Let’s start this by considering the basic single processor paradigm and Moore’s law, and how riding that steady pattern of increased circuit complexity within a given overall integrated circuit chip size has led to a capacity to perform more complex information processing tasks, and to do so with faster and faster clock speeds. I wrote in Part 7 of the maximum theoretical radius of a putative intelligent agent or system: biological and brain based, or artificial and electronic in nature, there assuming that a task could be completed, as a simplest possibility, just by successfully sending a single signal at the maximum speed achievable in that system, in a straight line, and for a period of time that is nominally assumed necessary to complete a task there. Think of increased chip/node clock speed here as an equivalent of adding allowance for increased functional complexity into what would actually be sent in that test case signal, or in any more realistic functional test counterparts to it. The more that a processor added into this as an initial signal source can do in a shorter period of time, in creating meaningful and actionable signal content to be so transmitted, the more functionally capable the agent or system that includes it can be, and still maintain a set maximum overall physical size.
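
A back-of-the-envelope version of that Part 7 scale argument can be written out in a few lines; the signal speeds and time budgets used below are assumed, order-of-magnitude illustration values only.

```python
# Illustrative assumptions only, not measured values.
SPEED_OF_LIGHT = 3.0e8        # m/s: upper bound for electronic signaling
NEURAL_SIGNAL_SPEED = 100.0   # m/s: order of magnitude for fast myelinated axons

def max_radius(signal_speed_m_s: float, time_budget_s: float) -> float:
    """Upper bound on system radius if a single signal has to cross it
    within the allotted task-completion time budget."""
    return signal_speed_m_s * time_budget_s

# One tick of a 1 GHz clock: light can cross at most ~0.3 m in that time.
print(max_radius(SPEED_OF_LIGHT, 1.0e-9))
# A 10 millisecond biological "task" budget: at most ~1 m.
print(max_radius(NEURAL_SIGNAL_SPEED, 0.01))
```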

Parallelism there can be seen as a performance multiplier: an efficiency and speed multiplier, and particularly when it can be carried out within a set, here effectively standardized volume of space, so as to limit the impact of maximum signal speeds in that initial processing as a performance degrader. Note that I just modified my original simplest, and presumably fastest and farthest physically reaching, maximum size allowing example from Part 7 by adding in a signal processor and generator at its starting point. And I also at least allow for a matching node at the distant signal receiving end of this too, where capacity to do more and to add in more signal processing capacity at one or both ends of this transmission, without increase in the timing requirements for that added processing overhead, would not reduce the effective maximum physical size of such a system in and of itself.

Parallel computing is a design approach that specifically, explicitly allows for such increased signal processing capacity, and at least in principle without necessarily adding in new timing delays and scale limitations – and certainly if it can be carried out within a single chip, that fits within the scale footprint of whatever single processor chip that it might be benchmark compared to.

I just added some assumptions into this narrative that demand acknowledging. And I begin doing so here by considering two types of tasks that are routinely carried out by biological brains, and certainly by higher functioning ones as would be found in vertebrate species: vision and the central nervous system processing that enters into that, and the information processing that would enter into carrying out tasks that cannot simply be handled by reflex and that would, as such, call for more novel and even at least in-part one-off information processing and learning.

Vision, as a flow of central nervous system and brain functions, is an incredibly complex process flow that involves pattern recognition and identification and a great deal more. And it can be seen as a quintessential example of a made-for-parallel-processing problem, where an entire visual field can be divided into what amounts to a grid pattern that maps input data arriving at an eye to various points on an observer’s retina, and where essentially the same at least initial processing steps would be called for, for each of those data reception areas and their input data.

I simplify this example by leaving specialized retinal areas such as the fovea out of consideration, with its more sharply focused, detail-rich visual data reception and the more focused brain-level processing that would connect to that. The more basic, standardized model of vision that I am offering here applies to the data reception and processing for essentially all of the rest of the visual field of a vertebrate eye and for its brain-level processing. (For a non-vision, comparable computer systems example of a parallel computing-ready problem, consider the analysis of seismic data as collected from arrays of ground-based vibration sensors, as would be used to map out anything from the deep geological features and structures associated with potential petrochemical deposits, to the mapping details of underground fault lines that would hold importance in a potential earthquake context, or that might be used to distinguish between a possible naturally occurring earthquake and a below-ground nuclear weapons test.)
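
For readers who think more easily in code, here is a minimal sketch of that same-processing-over-every-grid-cell pattern, using only the Python standard library. The “processing” applied to each tile is a trivial brightness average, standing in for whatever early-stage feature extraction a real visual system or seismic analysis pipeline would actually perform.

```python
from concurrent.futures import ProcessPoolExecutor

def process_tile(tile):
    """Stand-in for the identical early-stage processing applied to each
    patch of the visual field: here, just the mean intensity of the tile."""
    flat = [pixel for row in tile for pixel in row]
    return sum(flat) / len(flat)

def split_into_tiles(image, tile_size):
    """Partition a 2D intensity grid into non-overlapping square tiles."""
    tiles = []
    for r in range(0, len(image), tile_size):
        for c in range(0, len(image[0]), tile_size):
            tiles.append([row[c:c + tile_size]
                          for row in image[r:r + tile_size]])
    return tiles

if __name__ == "__main__":
    # A toy 8x8 "retina" of intensity values; real inputs would be far larger.
    image = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
    tiles = split_into_tiles(image, tile_size=4)
    with ProcessPoolExecutor() as pool:        # each tile handled in parallel
        results = list(pool.map(process_tile, tiles))
    print(results)
```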

My more one-off experience example and its information processing might involve parallel processing, certainly when comparing apparently new input with what is already known of and held in memory, as a speed-enhancing approach, to cite one possible area for such involvement. But the core of this type of information processing task and its resolution is likely to be more specialized, and driven by a single, non-parallel processor or its equivalent.

And this brings me specifically and directly to the question of problem types faced: of data processing and analysis types and how they can best be functionally partitioned algorithmically. I have in a way already said in what I just wrote here, what I will more explicitly make note of now when addressing that question. But I will risk repeating the points that I made on this by way of special case examples, as more general principles, for purposes of increased clarity and focus and even if that means my being overly repetitious:

• The perfect parallel processing-ready problem is one that can be partitioned into a large and even vastly large set of what are essentially identical, individually simpler processing problems, where an overall solution to the original problem as a whole calls for carrying out all of those smaller, standardized sub-problems and stitching their resolutions together into a single organized whole. This might at times mean fully resolving the sub-problems and then combining them into a single organized whole, but more commonly this means developing successive rounds of preliminary solutions for them and repeatedly bringing them together, where adjacent parallel processing cells in this serve as boundary value input for their neighbors in this type of system (see cellular automata, and the sketch following this list, for a more extreme example of how that need and its resolution can arise.)
• Single processor, and particularly computationally powerful single processor approaches become more effective, and even fundamentally essential, as soon as problems arise that need comprehensive information processing that cannot readily be divided up into arrays of similarly structured simpler sub-problems that individual smaller central processing units, or their biological equivalents, could separately address in parallel with each other, as they can in my vision example or in the non-vision computer systems example just given.
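
And here is the sketch promised in the first of those two points: a compact, illustrative rendering of the neighbor-boundary pattern in the cellular automata spirit. The grid is split into chunks that could in principle be assigned to separate processors, but each chunk needs one boundary (“halo”) cell from each neighbor before every round of updating. The particular update rule (elementary rule 110) and the chunk size are arbitrary illustrative choices.

```python
def rule_table(rule_number: int):
    """Map each (left, center, right) neighborhood to the next cell state,
    using the standard Wolfram numbering of elementary cellular automata."""
    bits = [(rule_number >> i) & 1 for i in range(8)]
    return {(l, c, r): bits[(l << 2) | (c << 1) | r]
            for l in (0, 1) for c in (0, 1) for r in (0, 1)}

def step_chunk(chunk, left_halo, right_halo, table):
    """Update one chunk for one round, given the single boundary cells
    ('halo' values) received from its neighboring chunks."""
    padded = [left_halo] + chunk + [right_halo]
    return [table[(padded[i - 1], padded[i], padded[i + 1])]
            for i in range(1, len(padded) - 1)]

def step_row(row, chunk_size, table):
    """One synchronized round: each chunk is updated using its neighbors'
    boundary values; the chunk updates are independent of each other and
    could run in parallel on separate processors."""
    chunks = [row[i:i + chunk_size] for i in range(0, len(row), chunk_size)]
    new_row = []
    for k, chunk in enumerate(chunks):
        left = chunks[k - 1][-1] if k > 0 else 0
        right = chunks[k + 1][0] if k < len(chunks) - 1 else 0
        new_row.extend(step_chunk(chunk, left, right, table))
    return new_row

table = rule_table(110)                  # an arbitrary, well-known rule choice
row = [0] * 15 + [1] + [0] * 16          # a single "on" cell in a field of 32
for _ in range(5):
    print("".join("#" if x else "." for x in row))
    row = step_row(row, chunk_size=8, table=table)
```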

And this leads me to two open questions:

• What areas and aspects of artificial intelligence, or of intelligence per se, can be parsed into sub-problems that would make parallel processing both possible, and more efficient than single processor computing might allow?
• And how algorithmically, can problems in general be defined and specified, so as to effectively or even more optimally make this type of determination, so that they can be passed onto the right types and combinations of central processor or equivalent circuitry for resolution? (Here, I am assuming generally capable computer architectures that can address more open-ended ranges of information processing problems: another topic area that will need further discussion in what follows.)

And I will of course discuss all of these issues from the perspective of Moore’s law and its equivalents and in terms of lock-in and its limiting restrictions, at least starting all of this in my next installment to this series.

The maximum possible physical size test of possible or putative intelligence-supporting systems, as already touched upon in this series, is only one way to parse such systems at a general outer-range parameter defining level. As part of the discussion to follow from here, I will at least briefly consider a second such approach, that is specifically grounded in the basic assumptions underlying Moore’s law itself: that increasing the number of computationally significant elements (e.g. the number of transistor elements in an integrated circuit chip), can and will increase the scale of a computational or other information processing problem that that physical system can resolve within any single set period of time. And that, among other things will mean discussing a brain’s counterparts to the transistors and other functional elements of an electronic circuit. And in anticipation of that discussion to come, this will mean discussing how logic gates and arrays of them can be assembled from simpler elements, and both statically and dynamically.
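
As a small preview of that gate-assembly point, the sketch below shows the standard textbook construction of the basic Boolean gates from NAND alone, NAND being functionally complete; this is general logic-design background offered for orientation, not a claim about how any particular brain or chip implements these functions.

```python
# NAND is functionally complete: every other basic gate can be built from it.
def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1

def not_(a: int) -> int:          # NOT from a single NAND
    return nand(a, a)

def and_(a: int, b: int) -> int:  # AND = NOT(NAND)
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:   # OR via De Morgan: NAND of the negations
    return nand(not_(a), not_(b))

def xor(a: int, b: int) -> int:   # XOR from four NANDs (standard construction)
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

def half_adder(a: int, b: int):
    """A tiny 'array of gates': the sum and carry bits of one-bit addition."""
    return xor(a, b), and_(a, b)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```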

Meanwhile, you can find this and related material at Ubiquitous Computing and Communications – everywhere all the time 3 and also see Page 1 and Page 2 of that directory. And I also include this in my Reexamining the Fundamentals 2 directory as topics Section VIII. And also see its Page 1.

Some thoughts concerning a general theory of business 30: a second round discussion of general theories as such, 5

Posted in blogs and marketing, book recommendations, reexamining the fundamentals by Timothy Platt on August 16, 2019

This is my 30th installment to a series on general theories of business, and on what general theory means as a matter of underlying principle and in this specific context (see Reexamining the Fundamentals directory, Section VI for Parts 1-25 and its Page 2 continuation, Section IX for Parts 26-29.)

I began this series in its Parts 1-8 with an initial orienting discussion of general theories per se, with an initial analysis of compendium model theories and of axiomatically grounded general theories as a conceptual starting point for what would follow. And I then turned from that, in Parts 9-25 to at least begin to outline a lower-level, more reductionistic approach to businesses and to thinking about them, that is based on interpersonal interactions. Then I began a second round, next step discussion of general theories per se in Parts 26-29 of this, building upon my initial discussion of general theories per se, this time focusing on axiomatic systems and on axioms per se and the presumptions that they are built upon. As a key part of that continued narrative, I offered a point of theory defining distinction in Part 28, that I began using there in this discussion, and that I continued using in Part 29 as well, and that I will continue using and developing here too, drawing a distinction between:

• Entirely abstract axiomatic bodies of theory that are grounded entirely upon sets of a priori presumed and selected axioms. These theories are entirely encompassed by sets of presumed fundamental truths: sets of axiomatic assumptions, as combined with complex assemblies of theorems and related consequential statements (lemmas, etc.) that can be derived from them, as based upon their own collective internal logic. Think of these as axiomatically closed bodies of theory.
• And theory specifying systems that are axiomatically grounded as above, with at least some a priori assumptions built into them, but that are also at least as significantly grounded in outside-sourced information too, such as empirically measured findings as would be brought in as observational or experimental data. Think of these as axiomatically open bodies of theory.

And I have referred to them, and will continue to refer to them, as axiomatically closed and open bodies of theory: convenient terms for denoting them. And that brings me up to the point in this developing narrative at which I would begin this installment, with two topic points that I will discuss in terms of how they arise in closed and open bodies of theory respectively:

• How would new axioms be added into an already developing body of theory, and how and when would old ones be reframed, generalized, limited for their expected validity and made into special case rules as a result, or be entirely discarded as organizing principles there per se?
• Then, after addressing that set of issues, I said that I would turn to consider issues of scope expansion for the set of axioms assumed in a given theory-based system, with a goal of more fully analytically discussing optimization for the set of axioms presumed, and what that even means.

I began discussing the first of these topic points in Part 29 and will continue doing so here. And after completing that discussion thread, at least for purposes of this digression into the epistemology of general theories per se, I will turn to and discuss the second of those points too. And I begin addressing all of this at the very beginning, with what was arguably the first, or at least the first still-existing, effort to create a fully complete and consistent axiomatically closed body of theory that was expected to encompass and resolve all possible problems and circumstances where it might conceivably be applied: Euclid’s geometry, as developed from the set of axiomatically presumed truths that he built his system upon.

More specifically, I begin this narrative thread with Euclid’s famous, or if you prefer infamous Fifth postulate: his fifth axiom, and how that defines and constrains the concept of parallelism. And I begin here by noting that mathematicians and geometers began struggling with it more than two thousand years ago, and quite possibly from when Euclid himself was still alive.

Unlike the other axioms that Euclid offered, this one did not appear to be self-evident. So a seemingly endless progression of scholars sought to find a way to prove it from the first four of Euclid’s axioms. And barring that possibility, scholars sought to develop alternative bodies of geometric theory that either offered alternative axioms to replace Euclid’s fifth, or that did without parallelism as an axiomatic principle at all, or that explicitly focused on it even at the cost of dispensing with the metric concepts of angle and distance (parallelism can be defined independently of them), as in the affine geometries.
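
For reference, here is the postulate at issue, in its traditional form and in the equivalent modern restatement (Playfair’s axiom) that is usually quoted in its place; this is standard background, not anything specific to the argument of this series.

```latex
% Euclid's fifth (parallel) postulate, traditional form:
% if a line falling on two lines makes the interior angles on one side
% sum to less than two right angles, then the two lines, extended
% indefinitely, meet on that side:
\alpha + \beta < \pi \;\Longrightarrow\; \text{the two lines meet on that side.}

% Playfair's axiom (equivalent, given Euclid's other axioms):
% through a point $P$ not on a line $\ell$, there is exactly one line
% through $P$ in their common plane that does not meet $\ell$.
```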

In an axiomatically closed body of theory context, this can all be thought of as offering what amounts to alternative realities, and certainly insofar as geometry is applied, for its provable findings, to the empirically observable real world. The existence of a formally, axiomatically specified non-Euclidean geometry such as an elliptic or hyperbolic geometry, which explicitly diverges from the Euclidean on the issue of parallelism, does not disprove Euclidean geometry, or even necessarily refute it except insofar as its existence shows that other equally solidly grounded, axiomatically-based geometries are possible too. So as long as the set of axioms that underlies a body of theory such as one of these geometries can be assumed to be internally consistent, the issue of reframing, generalizing, limiting or otherwise changing axioms in place within a closed body of theory is effectively moot.

As soon as outside-sourced empirical or other information is brought in that arises separately from and independently of the set of a priori axioms in place in a body of theory, all of that changes. And that certainly holds if such information (e.g. replicably observed empirical observations and the data derived from them) is held to be more reliably grounded and “truer” than data arrived at entirely as a consequence of logical evaluation of the consequences of the a priori axioms in place. (Nota bene: Keep in mind that I am still referring here to initial presumed axioms that are not in and of themselves directly empirically validated, and that might never have been in any way tested against outside-sourced observations, certainly not for the range of observation types that new forms of empirical data and their observed patterns might offer. Such new data might in effect force change in previously assumed, axiomatically framed truth.)

All I have done in the above paragraph is to somewhat awkwardly outline the experimental method, where theory-based hypotheses are tested against carefully developed and analyzed empirical data to see if that data supports or refutes them. And in that I focus on experimental testing that would support or refute what have come to be seen as really fundamental, underlying principles, and not just detail elaborations as to how the basic assumed principles in place would address very specific, special circumstances.

But this line of discussion overlooks, or at least glosses over a very large gap in the more complete narrative that I would need to address here. And for purposes of filling that gap, I return to reconsider Kurt Gödel and his proofs of the incompleteness of any axiomatic theory of arithmetic, and of the impossibility of proving absolute consistency for such a body of theory too, as touched upon here in Part 28. As a crude representation of a more complex overall concept, mathematical proofs can be roughly divided into two basic types:

• Existence proofs, which simply demonstrate that at least one mathematical construct exists within the framework of the set of axioms under consideration that would explicitly sustain or refute the theorem in question, but without in any way indicating its form or details, and
• Constructive proofs, which both prove the existence of a theorem-supporting or refuting mathematical construct, and also specifically identify and specify it for at least one realistic example, or at least one realistic category of such examples. (A standard illustration of this distinction follows below.)
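
That promised illustration, offered here purely as a textbook example and not as anything specific to Gödel’s work: the claim that there exist irrational numbers a and b with a^b rational has a famously non-constructive proof.

```latex
\text{Let } x = \sqrt{2}^{\sqrt{2}}. \quad
\text{If } x \text{ is rational, take } a = b = \sqrt{2}. \quad
\text{If } x \text{ is irrational, take } a = x,\ b = \sqrt{2}: \quad
a^b = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{\,2} = 2.
```

Either way such a pair exists, but the proof never tells us which case actually holds. A constructive proof would instead exhibit a specific pair, for example a = √2 and b = log₂ 9, for which a^b = 3.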

Gödel’s incompleteness theorem is an existence proof insofar as it does not constructively indicate any specific mathematical contexts where inconsistency explicitly arises. And even if it did, that arguably would only indicate where specific changes might be needed in order to seamlessly connect two bodies of mathematical theory: A and B, within a single axiomatic framework sufficiently complete and consistent for them, so as to be able to treat them as a single combined area of mathematics (e.g. combining algebra and geometry to arrive at a larger and more inclusive body of theory such as algebraic geometry.) And this brings me very specifically and directly to the issues of reverse mathematics, as briefly but very effectively raised in:

• Stillwell, J. (2018) Reverse Mathematics: Proofs from the Inside Out. Princeton University Press.

And I at least begin to bring that approach into this discussion by posing a brief set of very basic questions that arise of necessity from Gödel’s discoveries and the proof that he offered to validate them:

• What would be the minimum set of axioms, demonstrably consistent within that set, that would be needed in order to prove as valid, some specific mathematical theorem A?
• What would be the minimum set of axioms needed to so prove theorem A and also theorem B (or some other explicitly stated and specified finitely enumerable set of such theorems A, B, C etc.)?

Anything in the way of demonstrable incompleteness of the type required here, for bringing A and B (and C and …, if needed) into a single overarching theory, would call for a specific, constructively demonstrable expansion of the set of axioms assumed, in order to accomplish the goals implicit in those two bullet pointed questions. And any demonstrable inconsistency that were to emerge when seeking to arrive at such a minimal necessary axiomatic foundation for a combined theory would of necessity call for a reframing or a replacement at a basic axiomatic level, even in what are overtly closed axiomatic bodies of theory. So Euclidean versus non-Euclidean geometries notwithstanding, even a seemingly completely closed such body of theory might need to be reconsidered and axiomatically re-grounded, or discarded entirely.

I am going to continue this line of discussion in a next series installment, where I will turn to more explicitly consider axiomatically open bodies of theory in this context. And in anticipation of that narrative to come, I will consider:

• The emergence of both disruptively new types of data and of the empirical observations that could generate them,
• And shifts in the accuracy, resolution, or range of observations that more accepted and known types of empirical observation might suddenly be offering.

I add here that I have, of necessity, already begun discussing the second to-address topic point that I made note of towards the start of this posting:

• Scope expansion for the set of axioms assumed in a given theory-based system, and with a goal of more fully analytically discussing optimization for the set of axioms presumed, and what that even means.

I will continue on in this overall discussion to more fully consider that set of issues, and certainly where optimization is concerned in this type of context.

Meanwhile, you can find this and related material about what I am attempting to do here at About this Blog and at Blogs and Marketing. And I include this series in my Reexamining the Fundamentals directory and its Page 2 continuation, as topics Sections VI and IX there.

Reconsidering Information Systems Infrastructure 10

Posted in business and convergent technologies, reexamining the fundamentals by Timothy Platt on June 20, 2019

This is the 10th posting to a series that I am developing, with a goal of analyzing and discussing how artificial intelligence and the emergence of artificial intelligent agents will transform the electronic and online-enabled information management systems that we have and use. See Ubiquitous Computing and Communications – everywhere all the time 2 and its Page 3 continuation, postings 374 and loosely following for Parts 1-9. And also see two benchmark postings that I initially wrote just over six years apart but that together provided much of the specific impetus for my writing this series: Assumption 6 – The fallacy of the Singularity and the Fallacy of Simple Linear Progression – finding a middle ground and a late 2017 follow-up to that posting.

I have been discussing artificial intelligence agents from a variety of perspectives in this series, turning in Part 9, for example, to at least briefly begin a discussion of neural network and related systems architecture approaches to hardware and software development in that arena. And my goal in that has been to present a consistent, logically organized discussion of a very large and still largely amorphous complex of issues: issues that in their simplest case implementations are coming to be more fully understood, but that are still open and largely undefined when moving significantly beyond that.

We now have a fairly good idea as to what artificial specialized intelligence is, certainly when it can be encapsulated into rigorously defined starter algorithms with tightly constrained self-learning capabilities added in, that would primarily just help an agent to “random walk” its way towards greater efficiency in carrying out its specifically defined end-goal tasks. But in a fundamental sense, we are still in the position of standing as if at the edge of an abyss of yet-to-be-acquired knowledge and insight when it comes to dealing with genuinely open-ended tasks such as natural conversation, and the development of artificial agents that can master them.

I begin this posting by reiterating a basic paradigmatic approach that I have offered in other information technology development contexts, and both in this blog and as a consultant, that explicitly applies here too.

• Start with the problem that you seek to solve, and not with the tools that you might use in accomplishing that.

Start with the here-artificial intelligence problem itself that you seek to effectively solve or resolve: the information management and processing task that you seek to accomplish, and plan and design and develop from there. In a standard if perhaps at least somewhat complex-problem context and as a simple case ideal, this means developing an algorithm that would encapsulate and solve a specific, clearly stated problem in detail, and then asking necessary questions as they arise at the software level and then the hardware level, to see what would be needed to carry that out. And ultimately that will mean selecting, designing and building at the hardware level for data storage and accessibility, and for raw computational power requirements and related capabilities that would be needed for this work. And at the software level this would mean selecting programming languages and related information encoding resources that are capable of encoding the algorithm in place and that can manage its requisite data flows as it is carried out. And it means actually encoding all of the functionalities required in that algorithm, in those software tools so as to actually perform the task that it specifies. (Here, I presume in how I state this, as a simplest case scenario, a problem that can in fact be algorithmically defined up-front and without any need for machine learning and algorithm adjustment as better and best solutions are iteratively found for the problem at hand. And I arbitrarily represent the work to be done there as fitting into what might in fact be a very large and complex “single overall task”, and even if carrying it out might lead to very different outcomes depending on what decision points have to be included and addressed there and certainly at a software level. I will, of course, set aside these and several other similar more-simplistic assumptions as this overall narrative proceeds and as I consider the possibilities of more complex artificial intelligence challenges. But I offer this simplified developmental model approach here, as an initial starting point for that further discussion to come.)

• Stepping back to consider the design and development approach that I have just offered here, if just in a simplest application form, this basic task-first and hardware detail-last approach can be applied to essentially any task, problem or challenge that I might address here in this series. I present that point of judgment on my part as an axiomatic given, even when ontological and even evolutionary development, as self-organized and carried out by and within the artificial agents carrying out this work, is added into the basic design capabilities developed. There, the How details might change but the overall Towards What goals would not necessarily do so, unless the overall problem to be addressed is changed or replaced.

So I start with the basic problem-to-software-to-hardware progression that I began this line of discussion with, and continue building from there with it, though with a twist, and certainly for artificial intelligence oriented tasks that are of necessity going to be less settled up-front as to the precise algorithms that would ultimately be required. I step back from the more firmly stated a priori assumptions explicitly outlined above in my simpler case problem solving scenario – assumptions that I would continue to pursue as-is in more standard task-to-software-to-hardware computational systems analyses, and certainly where off the shelf resources would not suffice – to add another level of detail there.

• And more specifically here, I argue a case for building flexibility into these overall systems, along with the particular requirements that that adds to the above development approach.
• And I argue a case for designing, developing and building overall systems – and explicitly conceived artificial intelligence agents in particular – with an awareness of the need for such flexibility in scale and in design from their initial task specification step in this development process, and with more and more room for adjustment and systems growth, and for self-adjustment within these systems, added in at each successive development step carried out from there too.

I focused in Part 9 on hardware, and on neural network designs and their architecture, at least as might be viewed from a higher conceptual perspective. And I then began this posting by positing, in effect, that starting with the hardware and its considerations might be compared to looking through a telescope – but backwards. And I now say that a prospective awareness of increasing resource needs, with next systems-development steps, is essential. That understanding needs to enter into any systems development effort as envisioned here, from the dawn of any Day 1 in developing and building towards it. This flexibility and its requisite scope and scale change requirements, I add, cannot necessarily be anticipated in advance of actually being needed, at any software or hardware level, and certainly not in any detail. So I write here of what might be called flexible flexibility: flexibility that can itself be adjusted and updated for type and scope as changing needs and new forms of need arise. On the face of things, this sounds like I have now reversed course here and that I am arguing a case for hardware then software then problem as an orienting direction of focused consideration, or at the very least hardware plus software plus problem as a simultaneously addressed challenge. There is in fact an element of truth to that final assertion, but I am still primarily just adding flexibility, and the capacity to change directions of development as needed, into what is still basically the same settled paradigmatic approach. Ultimately, the underlying problem to be resolved has to take center stage and the lead here.

And with that all noted and for purposes of narrative continuity from earlier installments to this series if nothing else, I add that I ended Part 9 by raising a tripartite point of artificial intelligence task characterizing distinction, that I will at least begin to flesh out and discuss here:

• Fully specified systems goals (e.g. chess rules as touched upon in Part 8 for an at least somewhat complex example, but with fully specified rules defining a win and a loss, etc. for it.),
• Open-ended systems goals (e.g. natural conversational ability as more widely discussed in this series and certainly in its more recent installments with its lack of corresponding fully characterized performance end points or similar parameter-defined success constraints), and
• Partly specified systems goals (as in self-driving cars where they can be programmed with the legal rules of the road, but not with a correspondingly detailed algorithmically definable understanding of how real people in their vicinity actually drive and sometimes in spite of those rules: driving according to or contrary to the traffic laws in place.)

My goal here as noted in Part 9, is to at least lay a more detailed foundation for focusing on that third, gray area middle-ground task category in what follows, and I will do so. But to explain why I would focus on that and to put this step in this overall series narrative into clearer perspective, I will at least start with the first two, as benchmarking points of comparison. And I begin that with fully specified systems and with the very definite areas of information processing flexibility that they still can require – and with the artificial agent chess grand master problem.

• Chess is a rigorously definable game as considered at an algorithm level. All games as properly defined involve two players. All involve functionally identical sets of game pieces, both for the numbers and the types of pieces that those players would start out with. All chess games are played on a completely standardized game board with opposing side pieces positioned to start in a single standard accepted pattern. And opposing players take turns moving pieces on that playing board, with rules in place that determine who is to make the first move, going first in any given game played.
• The chess pieces that are employed in this all have specific rules associated with them as to how they can be moved on a board, and for how pieces can be captured and removed by an opposing player. A player declares “check” when their move leaves the opposing king under direct attack. Winning in chess by definition means bringing the opposing player’s king to a position where it is under attack and cannot escape; the king is never actually captured. And when a player wins in that fully specified sense, they declare “checkmate.” And if a situation arises in which both players realize that a definitive formal win cannot be achieved from how the pieces that remain in play are laid out on the board, preventing either player from ever being able to checkmate their opponent’s king and win, a draw is called.
• I have simplified this description for a few of the rules possibilities that enter into this game when correctly played, omitting a variety of at least circumstantially important details. But bottom line, the basic How of playing chess is fully and readily amenable to being specified within a single highly precise algorithm that can be in place and in use a priori to the actual play of any given chess game.
• Similar algorithmically defined specificity could be offered in explaining a much simpler game: tic-tac-toe, with its simple and limited range of moves and move combinations. Chess rises to the level of complexity and the level of interest that would qualify it for consideration here because of the combinatorial explosion in the number of possible distinct games of chess that can be played, each carrying out an at least somewhat distinct combination of moves when compared with any other of the overall set. All games start out the same with all pieces identically positioned. After the first set of moves, with each player moving once, there are 400 distinct board setups possible, with 20 possible white piece moves and 20 possible black piece moves. After two rounds of moves there are some 197,742 possible games, and after three, that number expands out further to over 121 million (see the sketch following this list for one way to count these). This range of possibilities arises at the very beginning of any actual game, with the numbers of moves and of board layouts continuing to expand from there, and with the overall number of moves and move combinations growing to exceed and even vastly exceed the number of board position combinations possible, as differing move patterns can converge on the same realized board layouts. And this is where strategy and tactics enter chess, and in ways that would be meaningless for a game such as tic-tac-toe. And this is where the drive to develop progressively more effective chess playing algorithm-driven artificial agents enters this too, where those algorithms would just begin with the set rules of chess and extend out from there to include tactical and strategic chess playing capabilities as well – so agents employing them can play strongly competitive games and not just by-the-rules, “correct” games.
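
The combinatorial explosion just described is easy to reproduce directly. The sketch below assumes the third-party python-chess package (installable as chess via pip), which is not part of the Python standard library; it simply counts distinct legal move sequences from the starting position (the standard “perft” enumeration). Note that published “possible games” figures vary slightly with counting conventions: a straight perft count gives 20 sequences after one ply, 400 after one move by each player, and 197,281 after two moves by each player.

```python
# Requires the third-party python-chess package:  pip install chess
import chess

def perft(board: chess.Board, depth: int) -> int:
    """Count distinct legal move sequences of the given length in plies."""
    if depth == 0:
        return 1
    total = 0
    for move in board.legal_moves:
        board.push(move)
        total += perft(board, depth - 1)
        board.pop()
    return total

board = chess.Board()                  # the standard starting position
for plies in range(1, 5):              # depth 4 takes a few seconds in pure Python
    print(plies, perft(board, plies))  # 20, 400, 8902, 197281
```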

So when I offer fully specified systems goals as a task category above, I assume as an implicit part of its definition that the problems that it would include all involve enough complexity so as to prove interesting, and that they be challenging to implement and certainly if best possible execution of the specific instance implementations involved in them (e.g. of the specific chess games played) is important. And with that noted I stress that for all of this complexity, the game itself is constrainable within a single and unequivocal rules-based algorithm, and even when effective strategically and tactically developed game play would be included.

That last point is going to prove important and certainly as a point of comparison when considering both open-ended systems goals and their so-defined tasks, and partly specified systems goals and their tasks. And with the above offered I turn to the second basic benchmark that I would address here: open-ended systems goals. And I will continue my discussion of natural conversation in that regard.

I begin with what might be considered simple, scale-of-needed-activity-based complexity: the number of chess pieces on a board, and on one side of it in particular, when compared to the number of words as commonly used in wide-ranging, real-world natural conversation. Players start out with 16 chess pieces each, drawn from just six functionally distinct piece types; if you turn to resources such as the Oxford English Dictionary to benchmark English for its scale as a widely used language, it lists some 175,000 currently used words and another roughly 50,000 that are listed as obsolete but that are still at least occasionally used too. And this leaves out a great many specialized terms that would only arise when conversing about very specific and generally very technical issues. Assuming that an average person might in fact only actively use a fraction of this – let’s assume some 20,000 words on a more ongoing basis – that still adds tremendous new levels of complexity to any task that would involve manipulating and using them.

• Simple complexity of the type addressed there can perhaps best be seen as an extraneous complication here. The basic algorithm-level processing of a larger scale pieces-in-play set, as found in active vocabularies, would not necessarily be fundamentally affected by that increase in scale, beyond a requirement for better and more actively engaged sorting, filtering and related software, as what would most probably be more ancillary support functions. And most of the additional workload that all of this would bring with it would be carried out by scaling up the hardware and related infrastructure that would carry out the conversational tasks involved, certainly if a normal rate of conversational give and take is going to be required.
• Qualitatively distinctive, emergently new requirements for actually specifying and carrying out natural conversation would come from a very different direction, that I would refer to here as emergent complexity. And that arises in the fundamental nature of the goal to be achieved itself.

Let’s think about conversation and the actual real-world conversations that we ourselves enter into and every day. Many are simple and direct and focus on the sharing of specific information between or concerning involved parties. “Remember to pick up a loaf of bread and some organic lettuce at the store, on the way home today.” “Will do, … but I may be a little late today because I have a meeting that might run late at work that I can’t get out of. I’ll let you know if it looks like I am going to be really delayed from that. Bread and lettuce are on the way so that shouldn’t add anything to any delays there.”

But even there, and even with a brief and apparently focused conversation like this, a lot of what was said and even more of what was meant and implied depended on what might be a rich and complex background story, with added complexities there coming from both of the people speaking. And they might individually be hearing and thinking through this conversation in terms of at least somewhat differing background stories at that. What, for example, does “… be a little late today” mean? Is the second speaker’s boss, or whoever is calling this meeting, known for doing this, and disruptively so for the end-of-workday schedules of all involved? Does “a little” here mean an actual just-brief delay, or could this mean everyone in the room feeling stressed for being held late for so long, with that simply adding to an ongoing pattern? The first half of this conversation was about getting more bread and lettuce, but the second half of it, while acknowledging that and agreeing to it, was in fact very different and much more open-ended for its potential implied side-messages. And this was in fact a very simple and very brief conversation.

Chess pieces can make very specific and easily characterized moves that fit into specific patterns and types of them. Words as used in natural conversations cannot be so simply characterized, and conversations – even short and simple ones – often fit into larger ongoing contexts, and into contexts that different participants or observers might see very differently. And this is true even if none of the words involved have multiple possible dictionary definition meanings, if none of them can be readily or routinely used in slang or other non-standard ways, and if none of them have matching homophones – if there is no confusion as to precisely which word was being used because two or more that differ by definition sound the same (e.g. knight or night, and to, too or two.)

And this, for all of its added complexities, does not even begin to address issues of euphemism, or agendas that a speaker might have with all of the implicit content and context that would bring to any conversation, or any of a wide range of other possible issues. It does not even address the issues of accent and its accurate comprehension. But more to the point, people can and do converse about any and every one of a seemingly entirely open-ended range of topics and issues, and certainly when the more specific details addressed are considered. Just consider the conversation that would take place if the shopper of the above-cited chat were to arrive home with a nice jar of mayonnaise and some carrots instead of bread and lettuce, after assuring that they knew what was needed and saying they would pick it up at the store. Did I raise slang here, or dialect differences? No, and adding them in here still does not fully address the special combinatorial explosions of meaning at least potentially expressed and at least potentially understood that actual wide-ranging, open-ended natural conversation brings with it.

And all of this brings me back to the point that I finished my above-offered discussion of chess with, and winning games in it as an example of a fully specified systems goal. Either one of the two players in a game of chess wins and the other loses, or they find themselves having to declare a draw for being unable to reach a specifically, clearly, rules-defined win/lose outcome. So barring draws that might call for another try that would at least potentially reach a win and loss, all chess games, if completed, lead to a single defined outcome. But there is no single conversational outcome that would meaningfully apply to all situations and contexts, all conversing participants and all natural conversation – unless you were to attempt to arrive at some overall principle that would of necessity be so vague and general as to be devoid of any real value. Open-ended systems goals, as the name implies, are open-ended. And a big part of developing and carrying through a realistic sounding natural conversational capability in an artificial agent has to be that of keeping a conversation in focus in a way that is both meaningful and acceptable to all involved parties, where that would mean knowing when a conversation should be concluded and how, and in a way that would not lead to confusion or worse.

And this leads me, finally, to my gray area category: partly specified systems goals and the tasks and task-performing agents that would carry them out, both on a specific instance-by-instance basis and in general. My goal for what is to follow now, is to start out by more fully considering my self-driving car example, then turning to consider partly specified systems goals and the agents that would carry out tasks related to them, in general. And I begin that by making note of a crucially important detail here:

• Partly specified systems goals can be seen as gateway and transitional challenges, and while solving them as a practical matter can be important in and of itself,
• Achieving effective problem resolutions there can perhaps best be seen as a best practices route for developing the tools and technologies that would be needed for better resolving open-ended systems challenges too.

Focusing on the learning curve potential of these challenge goals, think of the collective range of problems that would fit into this mid-range task set as taking the overall form of a swimming pool with a shallow and a deep end, and where deep can become profoundly so. At the shallow end of this continuum of challenge, partly specified systems merge into the perhaps more challenging end of fully specified systems goals and their designated tasks. So as a starting point, let’s address low-end, or shallow end partly specified artificial intelligence challenges. At the deeper end of this continuum, it would become difficult to fully determine whether a proposed problem should best be considered partly specified or open-ended in nature, and it might in fact start out designated one way only to evolve into the other.

I am going to continue this narrative in my next installment to this series, starting with a more detailed discussion of partly specified systems goals and their agents as might be exemplified by my self-driving car problem/example. I will begin with a focus on that particular case in point challenge and will continue from there to consider these gray area goals and their resolution in more general terms, and both in their own right and as evolutionary benchmark and validation steps that would lead to carrying out those more challenging open-ended tasks.

In anticipation of that line of discussion to come and as an opening orienting note for what is to come next in this series, I note a basic assumption that is axiomatically built into the basic standard understanding of what an algorithm is: that all step-by-step process flows as carried out in it would ultimately lead to, or at least towards, some specific, at least conceptually defined goal. (I add “towards” there to include algorithms that, for example, seek to calculate the value of the number pi (π) to an arbitrarily large number of significant digits, where complete task resolution is by definition going to be impossible. And for a second type of ongoing example, consider an agent that would manage and maintain environmental conditions such as atmospheric temperature and quality within set limits in the face of complex ongoing perturbing forces, where an ultimate, final “achieve and done” cannot apply.)
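
As a minimal illustrative sketch of that distinction, and not anything drawn from the postings under discussion here, the following contrasts a goal-terminating algorithm (a Leibniz-series approximation of pi, run until a requested tolerance is reached) with a control loop that by design has no final “achieve and done” state (a thermostat-style regulator holding a value within set limits against random perturbation). All function names and parameters here are hypothetical choices of my own.

```python
import random

def approximate_pi(tolerance: float) -> float:
    """Leibniz series: pi/4 = 1 - 1/3 + 1/5 - ...
    Works toward its goal and terminates once the requested tolerance
    is reached: an endpoint-determinable algorithm."""
    total, k = 0.0, 0
    while True:
        term = (-1) ** k / (2 * k + 1)
        total += term
        k += 1
        if abs(term) < tolerance:   # alternating series: the next term bounds the remaining error in pi/4
            return 4 * total

def regulate_temperature(setpoint: float, band: float, steps: int) -> float:
    """A thermostat-style agent: it maintains a condition within set limits
    in the face of ongoing perturbing forces, so there is no final goal
    state, only continued regulation (capped here just so the demo ends)."""
    temperature = setpoint
    for _ in range(steps):
        temperature += random.uniform(-1.0, 1.0)   # perturbing forces
        if temperature > setpoint + band:
            temperature -= 1.0                      # cool
        elif temperature < setpoint - band:
            temperature += 1.0                      # heat
    return temperature

print(approximate_pi(1e-6))                               # approaches 3.14159...
print(regulate_temperature(setpoint=21.0, band=0.5, steps=1000))
```

The first function meets the classical assumption head on; the second can only ever be evaluated by how well it stays within its limits over time, which is the point of the “towards” qualifier above.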

Fully specified systems goals can in fact often be encapsulated within endpoint-determinable algorithms that meet the definitional requirements of that axiomatic assumption. Open-ended goals as discussed here would arguably not always fit any single algorithm in that way. For them, ongoing benchmarking and performance metrics that fit within agreed-to parameters might provide the best alternative to any final goal specification of the type presumed there.

In a natural conversation, this might mean, for example, the people engaged in it not finding themselves confused as to how their chat seems to have become derailed through a loss of focus on what is actually supposed to be under discussion. But even that type and level of understanding can be complex, as perhaps illustrated with my “shopping plus” conversational example of above.

So I will turn to consider middle ground, partly specified systems goals and agents that might carry out tasks that would realize them in my next installment here. And after completing that line of discussion, at least for purposes of this series, I will turn back to reconsider open-ended goals and their agents again, and more from a perspective of general principles.

Meanwhile, you can find this and related postings and series at Ubiquitous Computing and Communications – everywhere all the time and its Page 2 and Page 3 continuations. And you can also find a link to this posting, appended to the end of Section I of Reexamining the Fundamentals as a supplemental entry there.

Moore’s law, software design lock-in, and the constraints faced when evolving artificial intelligence 7

This is my 7th posting to a short series on the growth potential and constraints inherent in innovation, as realized as a practical matter (see Reexamining the Fundamentals 2, Section VIII for Parts 1-6.) And this is also my fourth posting to this series, to explicitly discuss emerging and still forming artificial intelligence technologies as they are and will be impacted upon by software lock-in and its imperatives, and by shared but more arbitrarily determined constraints such as Moore’s law (see Part 4, Part 5 and Part 6.)

I focused in Part 6 of this narrative on a briefly stated succession of development possibilities that all relate to how an overall next generation internet will take shape: one that is largely and even primarily driven, at least for a significant proportion of the functional activity carried out in it, by artificial intelligence agents and devices: an increasingly large internet of things and of smart artifactual agents that act among them. And I began that with a continuation of a line of discussion that I began in earlier installments to this series, centering on four possible development scenarios as initially offered by David Rose in his book:

• Rose, D. (2014) Enchanted Objects: design, human desire and the internet of things. Scribner.

I added something of a fifth such scenario, or rather a caveat-based acknowledgment of the unexpected in how this type of overall development will take shape, in Part 6. And I ended that posting with a somewhat cryptic anticipatory note as to what I would offer here in continuation of its line of discussion, which I repeat now for smoother continuity of narrative:

• I am going to continue this discussion in a next series installment, where I will at least selectively examine some of the core issues that I have been addressing up to here in greater detail, and how their realized implementations might be shaped into our day-to-day reality. And in anticipation of that line of discussion to come, I will do so from a perspective of considering how essentially all of the functionally significant elements to any such system and at all levels of organizational resolution that would arise in it, are rapidly coevolving and taking form, and both in their own immediately connected-in contexts and in any realistic larger overall rapidly emerging connections-defined context too. And this will of necessity bring me back to reconsider some of the first issues that I raised in this series too.

The core issues that I would continue addressing here as follow-through from that installment, fall into two categories. I am going to start this posting by adding another scenario to the set that I began presenting here, as initially set forth by Rose with his first four. And I will use that new scenario to make note of and explicitly consider an unstated assumption that was built into all of the other artificial intelligence proliferation and interconnection scenarios that I have offered here so far. And then, and with that next step alternative in mind, I will reconsider some of the more general issues that I raised in Part 6, further developing them too.

I begin all of this with a systems development scenario that I would refer to as the piecewise distributed model.

• The piecewise distributed model for how artificial intelligence might arise as a significant factor in the overall connectiverse that I wrote of in Part 6 is based on current understanding of how human intelligence arises in the brain as an emergent property, or rather set of them, from the combined and coordinated activity of simpler components that individually do not display anything like intelligence per se, and certainly not artificial general intelligence.

It is all about how neural systems-based intelligence arises from lower level, unintelligent components in the brain and how that might be mimicked, or recapitulated if you will, through structurally and functionally analogous systems and their interconnections, in artifactual systems. And I begin to more fully characterize this possibility by more explicitly considering scale, and to be more precise the scale, or range of reach, of the simpler components that might be brought into such higher level functioning totalities. And I begin that with a simple if perhaps somewhat odd sounding question:

• What is the effective functional radius of the human brain, given the processing complexities and the numbers and distributions of nodes in the brain that are brought into play in carrying out a “higher level” brain activity, the speed of neural signal transmission in that brain as a parametric value in calculations here, and an at least order-of-magnitude assumption as to the timing latency, from initiation to conclusion, of reaching conscious awareness of a solution arrived at for the brain activity task at hand?

And with that as a baseline, I will consider the online and connected alternative that a piecewise distributed model artificial general intelligence, or even just a higher level but still somewhat specialized artificial intelligence would have to function within.

Let’s begin this side by side comparative analysis with consideration of what might be considered a normative adult human brain, and with a readily and replicably arrived at benchmark number: myelinated neurons as found in the brain send signals at a rate of approximately 120 meters per second, where one meter is equal to approximately three and a quarter feet in distance. And for simplicity’s sake I will simply benchmark the latency from the starting point of a cognitively complex task to its consciously perceived completion at one tenth of a second. This would yield an effective functional radius of that brain at 12 meters or 40 feet, or less – assuming as a functionally simplest extreme case for that outer range value that the only activity required to carry out this task was the simple uninterrupted transmission of a neural impulse signal along a myelinated neuron for some minimal period of time to achieve “task completion.”

An actual human brain is of course a lot more compact than that, and a lot more structurally complex too, with specialized functional nodes and complex arrays of parallel processor-organized, structurally and functionally duplicated elements in them. And that structural and functional complexity, and the timing needed to access stored information from and add new information back into memory again as part of that task activity, slows actual processing down. An average adult human brain is some 15 centimeters, or six inches, front to back. Using that as an outside-value metric, with a radius based on it of some three inches (roughly 7.5 centimeters), the structural and functional complexities in the brain that would be called upon to carry out that tenth of a second task would effectively reduce its effective functional radius roughly 160-fold from the speedy transmission-only outer value that I began this brief analysis with.

Think of that as a speed and efficiency tradeoff reduction imposed on the human brain by its basic structural and functional architecture and by the nature and functional parameters of its component parts, on the overall possible maximum rate of activity, at least for tasks performed that would fit the overall scale and complexity of my tenth of a second benchmark example. Now let’s consider the more artifactual overall example of computer and network technology as would enter into my above-cited piecewise distributed model scenario, or in fact into essentially any network distributed alternative to it. And I begin that by noting that the speed of light in a vacuum is approximately 300 million meters per second, and that electrical signals can propagate along pure copper wire at up to approximately 99% of that value.

I will assume for purposes of this discussion that photons in wireless networked and fiber optic connected aspects of such a system, and the electrons that convey information through their flow distributions in more strictly electronic components of these systems all travel on average at roughly that same round number maximum speed, as any discrepancy from it in what is actually achieved would be immaterial for purposes of this discussion, given my rounding off and other approximations as resorted to here. Then, using the task timing parameter of my above-offered brain functioning analysis, as sped up to one tenth of a millisecond for an electronic computer context, an outer limit transmission-only value for this system and its physical dimensions would suggest a maximum radius of some 30,000 kilometers, encompassing all of the Earth and all of near-Earth orbit space and more. There, in counterpart to my simplest case neural signal transmission processing as a means of carrying out the above brain task, I assume here that its artificial intelligence counterpart might be completed simply by the transmission of a single pulse of electrons or photons and without any processing step delays required.
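
The arithmetic behind those two outer-bound estimates is simple enough to lay out explicitly. The following sketch, offered only as a restatement of the round numbers already cited in this posting, recomputes the transmission-only brain radius, its networked counterpart, and the reduction factor imposed by the brain’s actual dimensions; the variable names are mine.

```python
# Transmission-only, outer-bound "effective functional radius" estimates,
# using the rounded figures cited in the text above.

# Brain case: myelinated conduction at ~120 m/s, with a benchmark latency
# of one tenth of a second from task initiation to perceived completion.
neural_speed_m_per_s = 120.0
brain_task_latency_s = 0.1
brain_radius_m = neural_speed_m_per_s * brain_task_latency_s
print(f"Brain outer-bound radius: {brain_radius_m:.0f} m")          # 12 m

# Networked case: signals at roughly the speed of light, with the task
# benchmark sped up to one tenth of a millisecond.
signal_speed_m_per_s = 3.0e8
network_task_latency_s = 1.0e-4
network_radius_km = signal_speed_m_per_s * network_task_latency_s / 1000.0
print(f"Network outer-bound radius: {network_radius_km:,.0f} km")   # 30,000 km

# Reduction imposed by the brain's actual ~7.5 cm (three inch) radius:
actual_brain_radius_m = 0.075
print(f"Reduction factor: {brain_radius_m / actual_brain_radius_m:.0f}x")  # ~160x
```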

Individual neurons can fire up to some 200 times per second, depending on the type of function carried out, and an average neuron in the brain connects to what is on the order of 1,000 other neurons through complex dendritic branching and the synaptic connections they lead to, with some neurons connecting to as many as 10,000 others and more. I assume that artificial networks can grow to that level of interconnected connectivity and more too, and with levels of involved nodal connectivity brought into any potentially emergent artificial intelligence activity that might arise in such a system, that match and exceed those of the brain for complexity there too. That, at least, is likely to prove true for what would with time become the all but myriad organizing and managing nodes that would arise in at least functionally defined areas of this overall system and that would explicitly take on middle and higher level SCADA-like command and control roles there.

This would slow down the actual signal transmission rate achievable, and reduce the maximum physical size of the connected network space involved here too, though probably not as severely as observed in the brain. There, even today’s low cost readily available laptop computers can now carry out on the order of a billion operations per second and that number continues to grow as Moore’s law continues to hold forth. So if we assume “slow” and lower priority tasks as well as more normatively faster ones for the artificial intelligence network systems that I write of here, it is hard to imagine restrictions that might realistically arise that would effectively limit such systems to volumes of space smaller than the Earth as a whole, and certainly when of-necessity higher speed functions and activities could be carried out by much more local subsystems and closer to where their outputs would be needed.

And to increase the expected efficiencies of these systems, brain as well as artificial network in nature, effectively re-expanding their effective functional radii again, I repeat and invoke a term and a design approach that I used in passing above: parallel processing. That, and the inclusion of subtask-performing specialized nodes, are where effectively breaking up a complex task into smaller, faster-to-complete subtasks, whose individual outputs can be combined into a completed overall solution or resolution, can speed up overall task completion by orders of magnitude for many types of tasks, allowing more of them to be carried out within any given nominally expected benchmark time for “single” task completions. This of course also allows for faster completion of larger tasks within that type of performance-measuring timeframe window too.
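
As a purely illustrative sketch of that subdivide-and-combine speedup, and not a depiction of any specific system discussed here, the following splits one large task into independent subtasks, runs them concurrently with Python’s standard process pool, and then combines their outputs into a single overall result. The task itself (summing chunks of a range) is a stand-in chosen only because it partitions cleanly.

```python
from concurrent.futures import ProcessPoolExecutor

def subtask(chunk: range) -> int:
    """One independently computable piece of the larger overall task."""
    return sum(chunk)

def run_in_parallel(n: int, workers: int = 4) -> int:
    """Break the overall task into per-worker subtasks, run them in
    parallel, then combine their individual outputs into one result."""
    step = n // workers
    chunks = [range(i * step, (i + 1) * step) for i in range(workers)]
    chunks[-1] = range((workers - 1) * step, n)   # absorb any remainder
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partial_results = pool.map(subtask, chunks)
    return sum(partial_results)

if __name__ == "__main__":
    print(run_in_parallel(10_000_000))   # matches sum(range(10_000_000))
```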

• What I have done here, at least in significant part, is to lay out an overall maximum connected systems reach that could be applied to the completion of tasks at hand, and in either a human brain or an artificial intelligence-including network. And the limitations of accessible volume of space there correspondingly set an outer limit to the maximum number of functionally connected nodes that might be available, given that they all of necessity have space-filling volumes that are greater than zero.
• When you factor in the average maximum processing speed of any information processing nodes or elements included there, this in turn sets an overall maximum, outer limit value to the number of processing steps that could be applied in such a system, to complete a task of any given time-requiring duration, within such a physical volume of activity.
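
Stated as a rough back-of-the-envelope bound, and with the caveat that the symbols here are my own shorthand rather than anything formally defined in this series: if V is the accessible connected volume, v_min > 0 the minimum volume per node, r the maximum per-node processing rate (operations per second), and T the allowed task duration, then

```latex
N_{\mathrm{nodes}} \le \frac{V}{v_{\min}},
\qquad
N_{\mathrm{steps}} \le N_{\mathrm{nodes}} \cdot r \cdot T = \frac{V}{v_{\min}}\, r\, T .
```

Signal transit times between nodes, and any memory access or coordination overhead, would only pull that outer bound down further.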

What are the general principles beyond that set of observations that I would return to here, given this sixth scenario? I begin addressing that question by noting a basic assumption that is built into the first five scenarios as offered in this series, and certainly into the first four of them: that artificial intelligence per se resides as a significant whole in specific individual nodes. I fully expect that this will prove true in a wide range of realized contexts, as that possibility is already becoming a part of our basic reality now, with the emergence and proliferation of artificial specialized intelligence agents. But as this posting’s sixth scenario points out, that assumption is not the only one that might be realized. And in fact it will probably only account for part of what will come to be seen as artificial intelligence as it arises in these overall systems.

The second additional assumption that I would note here is that of scale and complexity, and how fundamentally different types of implementation solutions might arise, and might even be possible, strictly because of how they can be made to work with overall physical systems limitations such as the fixed and finite speed of light.

Looking beyond my simplified examples as outlined here, brain-based and artificial alike: what is the maximum effective radius of a wired AI network that would, as a distributed system, come to display true artificial general intelligence? How big a space would have to be tapped into for its included nodes to match a presumed benchmark human brain performance for threshold to cognitive awareness and functionality? And how big a volume of functionally connected nodal elements could be brought to bear for this? Those are open questions, as are their corresponding scale parameter questions for “natural” general intelligence per se. I would end this posting by simply noting that disruptively novel new technologies and technology implementations that significantly advance the development of artificial intelligence per se, and the development of artificial general intelligence in particular, are likely to improve both the quality and functionality of the individual nodes involved, regardless of which overall development scenarios are followed, and their capacity to synergistically network together.

I am going to continue this discussion in a next series installment where I will step back from considering specific implementation option scenarios, to consider overall artificial intelligence systems as a whole. I began addressing that higher level perspective and its issues here, when using the scenario offered in this posting to discuss overall available resource limitations that might be brought to bear on a networked task, within given time-to-completion restrictions. But that is only one way to parameterize this type of challenge, and in ways that might become technologically locked in and limited from that, or allowed to remain more open to novelty – at least in principle.

Meanwhile, you can find this and related material at Ubiquitous Computing and Communications – everywhere all the time 3 and also see Page 1 and Page 2 of that directory. And I also include this in my Reexamining the Fundamentals 2 directory as topics Section VIII. And also see its Page 1.

Addendum note: The above presumptive end note added at the formal conclusion of this posting aside, I actually conclude this installment with a brief update to one of the evolutionary development-oriented examples that I in effect began this series with. I wrote in Part 2 of this series, of a biological evolution example of what can be considered an early technology lock-in, or rather a naturally occurring analog of one: of an ancient biochemical pathway that is found in all cellular life on this planet: the pentose shunt.

I add a still more ancient biological systems lock-in example here that in fact had its origins in the very start of life itself as we know it, on this planet. And for purposes of this example, it does not even matter whether the earliest genetic material employed in the earliest life forms was DNA or RNA in nature for how it stored and transmitted genetic information from generation to generation and for how it used such information in its life functions within individual organisms. This is an example that would effectively predate that overall nucleic acid distinction as it involves the basic, original determination of precisely which basic building blocks would go into the construction and information carrying capabilities of either of them.

All living organisms on Earth, with a few viral exceptions, employ DNA as their basic archival genetic material, and use RNA as an intermediary in accessing and making use of the information so stored there. Those viral exceptions use RNA for their own archival genetic information storage, and the DNA replicating and RNA fabrication machinery of the host cells they live in to reproduce. And the genetic information included in these systems, certainly at a DNA level, is all encoded in patterns of molecules called nucleotides that are laid out linearly along the DNA molecule. Life on Earth uses combinations of four possible nucleotides for this coding and decoding: adenine (A), thymine (T), guanine (G) and cytosine (C). And it was presumed, at least initially, that the specific chemistry of these four possibilities made them somehow uniquely suited to this task.

More recently it has been found that there are other possibilities that can be synthesized and inserted into DNA-like molecules, with the same basic structure and chemistry, that can also carry and convey this type of genetic information and stably, reliably so (see for example:

Hachimoji DNA and RNA: a genetic system with eight building blocks.)

And it is already clear that this indicates only a small subset of the information coding possibilities that might have arisen as alternatives, before the A/T/G/C genetic coding became locked in, in practice, in life on Earth.

If I could draw one relevant conclusion to this still unfolding story that I would share here, it is that if you want to find technology lock-ins, or their naturally occurring counterparts, look to your most closely and automatically held developmental assumptions, and certainly when you cannot rigorously justify them from first principles. Then question the scope of relevance and generality of your first principles there, for hidden assumptions that they carry within them.

Some thoughts concerning a general theory of business 29: a second round discussion of general theories as such, 4

Posted in blogs and marketing, book recommendations, reexamining the fundamentals by Timothy Platt on June 11, 2019

This is my 29th installment to a series on general theories of business, and on what general theory means as a matter of underlying principle and in this specific context (see Reexamining the Fundamentals directory, Section VI for Parts 1-25 and its Page 2 continuation, Section IX for Parts 26-28.)

I began this series in its Parts 1-8 with an initial orienting discussion of general theories per se, with an initial analysis of compendium model theories and of axiomatically grounded general theories as a conceptual starting point for what would follow. And I then turned from that, in Parts 9-25 to at least begin to outline a lower-level, more reductionistic approach to businesses and to thinking about them, that is based on interpersonal interactions. Then I began a second round, next step discussion of general theories per se in Parts 26-28 of this, building upon my initial discussion of general theories per se, this time focusing on axiomatic systems and on axioms per se and the presumptions that they are built upon.

More specifically, I have used the last three postings to that progression to at least begin a more detailed analysis of axioms as assumed and assumable statements of underlying fact, and of general bodies of theory that are grounded in them, dividing those theories categorically into two basic types:

• Entirely abstract axiomatic bodies of theory that are grounded entirely upon sets of a priori presumed and selected axioms. These theories are entirely encompassed by sets of presumed fundamental truths: sets of axiomatic assumptions, as combined with complex assemblies of theorems and related consequential statements (lemmas, etc) that can be derived from them, as based upon their own collective internal logic. Think of these as axiomatically closed bodies of theory.
• And theory specifying systems that are axiomatically grounded as above, with at least some a priori assumptions built into them, but that are also at least as significantly grounded in outside-sourced information too, such as empirically measured findings as would be brought in as observational or experimental data. Think of these as axiomatically open bodies of theory.

I focused on issues of completeness and consistency in these types of theory grounding systems in Part 28 and briefly outlined there, how the first of those two categorical types of theory cannot be proven either fully complete or fully consistent, if they can be expressed in enumerable form of a type consistent with, and as such including the axiomatic underpinnings of arithmetic: the most basic of all areas of mathematics, as formally axiomatically laid out by Whitehead and Russell in their seminal work: Principia Mathematica.

I also raised and left open the possibility that the outside validation provided in axiomatically open bodies of theory, as identified above, might afford alternative mechanisms for de facto validation of completeness, or at least consistency in them, where Kurt Gödel’s findings as briefly discussed in Part 28, would preclude such determination of completeness and consistency for any arithmetically enumerable axiomatically closed bodies of theory.

That point of conjecture began a discussion of the first of a set of three basic, and I have to add essential, topic points that would have to be addressed in establishing any attempted-comprehensive bodies of theory: the dual challenges of scope and applicability of completeness and consistency per se as organizing goals, and certainly as they might be considered in the contexts of more general theories. And that has left these two here-repeated follow-up points for consideration:

• How would new axioms be added into an already developing body of theory, and how and when would old ones be reframed, generalized, limited for their expected validity and made into special case rules as a result, or be entirely discarded as organizing principles there per se.
• Then after addressing that set of issues I said that I will turn to consider issues of scope expansion for the set of axioms assumed in a given theory-based system, and with a goal of more fully analytically discussing optimization for the set of axioms presumed, and what that even means.

My goal for this series installment is to at least begin to address the first of those two points and its issues, adding to my already ongoing discussion of completeness and consistency in complex axiomatic theories while doing so. And I begin by more directly and explicitly considering the nature of outside-sourced, a priori empirically or otherwise determined observations and the data that they would generate, that would be processed into knowledge through logic-based axiomatic reasoning.

Here, and to explicitly note what might be an obvious point of observation on the part of readers, I would as a matter of consistency represent the proven lemmas and theorems of a closed body of theory such as a body of mathematical theory, as proven and validated knowledge as based on that theory. And I correspondingly represent open question still-unproven or unrefuted theoretical conjectures as they arise and are proposed in those bodies of theory, as potentially validatable knowledge in those systems. And having noted that point of assumption (presumption?), I turn to consider open systems as for example would be found in theories of science or of business, in what follows.

• Assigned values and explicitly defined parameters, as arise in closed systems such as mathematical theories with their defined variables and other constructs, can be assumed to represent absolutely accurate input data. And that, at least as a matter of meta-analysis, even applies when such data is explicitly offered and processed through axiomatic mechanisms as being approximate in nature and variable in range; approximate and variable are themselves explicitly defined, or at least definable in such systems applications, formally and specifically providing precise constraints on the data that they would organize, even then.
• But it can be taken as an essentially immutable axiomatic principle: one that cannot be violated in practice, that outside sourced data that would feed into and support an axiomatically open body of theory, is always going to be approximate for how it is measured and recorded for inclusion and use there, and even when that data can be formally defined and measured without any possible subjective influence – when it can be identified and defined and measured in as completely objective a manner as possible and free of any bias that might arise depending on who observes and measures it.

Can an axiomatically open body of theory somehow be provably complete or even just consistent for that matter, due to the balancing and validating inclusion of outside frame of reference-creating data such as experientially derived empirical observations? That question can be seen as raising an interesting at least-potential conundrum and certainly if a basic axiom of the physical sciences that I cited and made note of in Part 28 is (axiomatically) assumed true:

• Empirically grounded reality is consistent across time and space.

That at least in principle, after all, raises what amounts to an immovable object versus an irresistible force type of challenge. But as soon as the data that is actually measured, as based on this empirically grounded reality, takes on what amounts to a built-in and unavoidable error factor, I would argue that any possible outside-validated completeness or consistency becomes moot at the very least, and certainly for any axiomatically open system of theory that might be contemplated or pursued here.

• This means that when I write of selecting, framing and characterizing and using axioms and potential axioms in such a system, I write of bodies of theory that are of necessity always going to be works in progress: incomplete and potentially inconsistent and certainly as new types of incoming data are discovered and brought into them, and as better and more accurate ways to measure the data that is included are used.

Let me take that point of conjecture out of the abstract by citing a specific source of examples that are literally as solidly established as our more inclusive and more fully tested general physical theories of today. And I begin this with Newtonian physics, which was developed at a time when experimental observation was limited both in the range of phenomena observed and in the levels of experimental accuracy attainable for what was observed and measured. Those limits made it impossible to empirically record the types of deviation from expected observations that would call for new and more inclusive theories, with new and altered underlying axiomatic assumptions, as subsequently arose in the special theory of relativity as found and developed by Einstein and others. Newtonian physics neither calls for nor accommodates anything like the axiomatic assumptions of the special theory of relativity, which holds for example that the speed of light is constant in all frames of reference. More accurate measurements, as taken over wider ranges of experimental examination of observable phenomena, forced change to the basic underlying axiomatic assumptions of Newton (e.g. his laws of motion.) And further expansion of the range of phenomena studied, and of the level of accuracy with which data is collected from all of this, might very well lead to the validation and acceptance of still more widely inclusive basic physical theories, with any changes in what they would axiomatically presume included in their foundations. (Discussion of alternative string theory models of reality, among other possibilities, comes to mind here, where experimental and observational limitations of the types that I write of here are such as to preclude any real culling and validating, to arrive at a best possible descriptive and predictive model theory.)
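
To make that contrast of underlying axioms concrete with a standard textbook illustration, one that I add here rather than one drawn from this series itself: under Newtonian assumptions, velocities simply add, while the special theory of relativity, built on the axiom that the speed of light c is the same in all frames of reference, replaces that rule with one whose results can never exceed c.

```latex
% Galilean (Newtonian) composition of velocities u and v:
w = u + v
% Relativistic composition of the same two velocities:
w = \frac{u + v}{1 + \dfrac{u v}{c^{2}}}
```

At everyday speeds the correction term u v / c^2 is far too small to have been measurable with the instruments of Newton’s era, which is precisely why the older axiomatic assumptions fit all of the data then available.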

At this point I would note that I tossed a very important set of issues into the above text in passing, and without further comment, leaving it hanging over all that has followed it up to here: the issues of subjectivity.

Data that is developed and tested for how it might validate or disprove proposed physical theory might be presumed to be objective, as a matter of principle. Or alternatively and as a matter of practice, it might be presumed possible to obtain such data that is arbitrarily close to being fully free from systematic bias, as based on who is observing and what they think about the meaning of the data collected. And the requirement that experimental findings be independently replicated by different researchers in different labs and with different equipment, and certainly where findings are groundbreaking and unexpected, serves to support that axiomatic assumption as being basically reliable. But it is not as easy or as conclusively presumable to assume that type of objectivity for general theories that of necessity have to include within them individual human understanding and reasoning, with all of the additional and largely unstated axiomatic presumptions that this brings with it, as exemplified by a general theory of business.

That simply adds whole new layers of reason to any argument against presumable completeness or consistency in such a theory and its axiomatic foundations. And once again, this leaves us with the issues of such theories always being works in progress, subject to expansion and to change in general.

And this brings me specifically and directly to the above-stated topics point that I would address here in this brief note of a posting: the determination of which possible axioms to include and build from in these systems. And that, finally, brings me to the issues and approaches that are raised in a reference work that I have been citing in anticipation of this discussion thread for a while now in this series, and an approach to the foundation of mathematics and its metamathematical theories that this and similar works seek to clarify if not codify:

• Stillwell, J. (2018) Reverse Mathematics: proofs from the inside out. Princeton University Press.

I am going to more fully and specifically address that reference and its basic underlying conceptual principles in a next series installment. But in anticipation of doing so, I end this posting with a basic organizing point of reference that I will build from there:

• The more traditional approach to the development and elaboration of mathematical theory, going back at least as far as the birth of Euclidean geometry, was one of developing a set of axioms that would be presumed as if absolute truths, and then developing emergent lemmas and theorems from them.
• Reverse mathematics is so named because it literally reverses that, starting with theorems to be proven and then asking what minimal sets of axioms would be needed in order to prove them.

My goal for the next installment to this series is to at least begin to consider both axiomatically closed and axiomatically open theory systems in light of these two alternative metatheory approaches. And in anticipation of that narrative line to come, this will mean reconsidering compendium models and how they might arise as the need for new axiomatic frameworks of understanding arises, and as established ones become challenged.

Meanwhile, you can find this and related material about what I am attempting to do here at About this Blog and at Blogs and Marketing. And I include this series in my Reexamining the Fundamentals directory and its Page 2 continuation, as topics Sections VI and IX there.

Reconsidering Information Systems Infrastructure 9

Posted in business and convergent technologies, reexamining the fundamentals by Timothy Platt on April 18, 2019

This is the 9th posting to a series that I am developing, with a goal of analyzing and discussing how artificial intelligence and the emergence of artificial intelligent agents will transform the electronic and online-enabled information management systems that we have and use. See Ubiquitous Computing and Communications – everywhere all the time 2 and its Page 3 continuation, postings 374 and loosely following for Parts 1-8. And also see two benchmark postings that I initially wrote just over six years apart but that together provided much of the specific impetus for my writing this series: Assumption 6 – The fallacy of the Singularity and the Fallacy of Simple Linear Progression – finding a middle ground and a late 2017 follow-up to that posting.

I stated towards the beginning of Part 8 of this series that I have been developing a foundation in it for thinking about neural networks and their use in artificial intelligence agents. And that has in fact been one of my primary goals here, as a means of exploring and analyzing more general issues regarding artificial agents and their relationships to humans and to each other, and particularly in a communications and an information-centric context and when artificial agents can change and adapt. Then at the end of Part 8, I said that I would at least begin to specifically discuss neural network architectures per se and systems built according to them in this complex context, starting here.

The key area of consideration that I would at least begin to address in this posting as a part of that narrative, is that of flexibility in range and scope for adaptive ontological change, where artificial intelligence agents would need that if they are to self-evolve new types of, or at least expanded levels of functional capabilities for more fully realizing the overall functional goals that they would carry out. I have been discussing natural conversation as a working artificial general intelligence-validating example of this type of goal-directed activity in this series. And I have raised the issues and challenges of chess playing excellence in Part 8, with its race to create the best chess player agent in the world as an ongoing computational performance benchmark-setting goal too, and with an ongoing goal beyond that of continued improvement in chess playing performance per se. See in that regard, my Part 8 discussion of the software-based AlphaZero artificial intelligence agent: the best chess player on the planet as of this writing.

Turning to explicitly consider neural networks and their emerging role in all of this: they are more generically wired systems when considered at a hardware level, that can flexibly adapt themselves on a task-performance basis, determining which specific possible circuit paths are actually developed and used within them, and which are downgraded and in effect functionally removed. These are self-learning systems that in effect rewire themselves to carry out the data processing flows of their targeted functions more effectively, developing and improving circuit paths that work for them and culling out and eliminating ones that do not – at a software level and de facto at a hardware level too.

While this suggestion is cartoonish in nature, think of these systems as blurring the lines between hardware and software, and think of them as being at least analogous to self-directed and self-evolving software-based hardware emulators in the process, where at any given point in time and stage in their ongoing development, they emulate through the specific pattern of preferred hardware circuitry used and their specific software in place, an up to that point most optimized “standard” hardware and software computer for carrying out their assigned task-oriented functions. It is just that neural networks can continue to change and evolve, testing and refining themselves, instead of being locked into a single fixed overall solution as would be the case in a “standard” design more conventional computer, and certainly when it is run between software upgrades.
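
As a deliberately toy-scale sketch of that strengthen-what-works, cull-what-does-not dynamic, and emphatically not a description of any production neural network architecture, the following trains a tiny network on a simple task and then prunes its weakest connections, in effect removing little-used “circuit paths” while preserving the behavior that the useful ones encode. The task, network size, and pruning threshold are all arbitrary choices of mine, made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: learn XOR with one hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # input -> hidden "circuit paths"
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):                        # plain full-batch gradient descent
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)       # gradient of squared error at the output
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(axis=0)

# "Culling": zero out the weakest 40% of hidden-layer connections by magnitude,
# loosely mimicking the removal of circuit paths that contribute little.
W1[np.abs(W1) < np.quantile(np.abs(W1), 0.4)] = 0.0

print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))
# Typically still close to [0, 1, 1, 0], though pruning this crudely can degrade it.
```

The pruning step here is crude on purpose; the point is only that the paths a trained system actually relies on can be kept while the rest are dropped, which is the hardware-and-software blurring described above.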

• I wrote in Part 8 of human-directed change in artificial agent design and both for overall systems architecture and for component-and-subsystem, by component-and-subsystem scaling. A standard, fixed design paradigmatic approach as found in more conventional computers as just noted here, fits into and fundamentally supports the systems evolution of fixed, standard systems and in its pure form cannot in general self-change either ontologically or evolutionarily.
• And I wrote in Part 8 of self-directed, emergent capabilities in artificial intelligence agents, citing how they might arise as preadapted capabilities that have arisen without regard to a particular task or functional goal now faced, but that might be directly usable for such a functional requirement now – or that might be readily adapted for such use with more targeted adjustment of the type noted here. And I note here that this approach really only becomes fundamentally possible in a neural network or similar, self-directed ontological development context, with that taking place within the hardware and software system under consideration.

Exaptation (pre-adaptation) is an evolutionary development option that would specifically arise in neural network or similarly self-changing and self-learning systems. And with that noted I invoke a term that has been running through my mind as I write this, and that I have been directing this discussion towards reconsidering here: an old software development term that in a strictly-human programmer context is something of a pejorative: spaghetti code. See Part 6 of this series where I wrote about this phenomenon in terms of a loss of comprehensibility as to the logic flow of whatever underlying algorithm a given computer program is actually running – as opposed to the algorithm that the programmer intended to run in that program.

I reconsider spaghetti code and its basic form here for a second reason, this time positing it as an alternative to lean code that would seek to carry out specific programming tasks in very specific ways and as quickly as possible and as efficiently as possible, as far as specific hardware architecture, system speed as measured by clock signals per unit time, and other resource usage requirements and metrics are concerned. Spaghetti code and its similarly more loosely structured counterparts, are what you should expect and they are what you get when you set up and let loose self-learning neural network-based or similar artificial agent systems and let them change and adapt without outside guidance, or interference if you will.

• These systems do not specifically, systematically seek to ontologically develop as lean systems, as that would most likely mean their locking in less optimal hardware-used and software-executed solutions than they could otherwise achieve.
• They self-evolve with slack and laxity in their systems, while iteratively developing towards next step improvements in what they are working on now, and in ways that can create pre-adaptation opportunities – and particularly as these systems become larger and more complex and as the tasks that they would carry out and optimize towards become more complex and even open-endedly so (as emerges when addressing problems such as chess, but that would come fully into its own for tasks such as development of a natural conversation capability.)

If more normative step-by-step ontological development of incremental performance improvements in task completion, can be compared to more gradual evolutionary change within some predictable-for-outline pattern, then the type of slack allowance with its capacity for creating fertile ground for possible pre-adaptation opportunity that I write of here, can perhaps best be compared to disruptive change or at least opportunity for it – at least for the visible outcome consequences observed as a pre-adapted capability that has not proven particularly relevant up to now is converted from a possibility to a realized current functionally significant actuality.

And with this noted, I raise a tripartite point of distinction, that I will at least begin to flesh out and discuss as I continue developing this series:

• Fully specified systems goals (e.g. chess rules as touched upon in Part 8 for an at least somewhat complex example, but with fully specified rules defining a win and a loss, etc. for it.),
• Open-ended systems goals (e.g. natural conversational ability as more widely discussed in this series and certainly in its more recent installments with its lack of corresponding fully characterized performance end points or similar parameter-defined success constraints), and
• Partly specified systems goals (as in self-driving cars where they can be programmed with the legal rules of the road, but not with a correspondingly detailed algorithmically definable understanding of how real people in their vicinity actually drive and sometimes in spite of those rules: driving according to or contrary to the traffic laws in place.)

I am going to discuss partly specified systems goals and agents, and overall systems that would include them and that would seek to carry out those tasks in my next series installment. And I will at least start that discussion with self-driving cars as a source of working examples and as an artificial intelligence agent goal that is still in the process of being realized, as of this writing. In anticipation of that discussion to come, this is where stochastic modeling enters this narrative.
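
In further anticipation of that, and as a purely hypothetical sketch rather than a description of any actual self-driving system, stochastic modeling in this context might look something like a Monte Carlo estimate of how often the gap that a self-driving vehicle maintains gets cut into by an aggressive human driver, as a function of that gap. Every probability and parameter below is invented for illustration only.

```python
import random

def estimate_cut_in_probability(gap_m: float, encounters: int = 20,
                                trials: int = 20_000) -> float:
    """Monte Carlo estimate of the chance that at least one nearby human
    driver cuts into the maintained following gap over a stretch of driving.
    The behavioral model is invented: a larger gap is assumed to invite
    cut-ins somewhat more often, up to a cap."""
    cut_in_trials = 0
    for _ in range(trials):
        for _ in range(encounters):
            p_cut_in = min(0.3, 0.01 * gap_m)   # hypothetical per-encounter probability
            if random.random() < p_cut_in:
                cut_in_trials += 1
                break
    return cut_in_trials / trials

for gap in (10, 20, 30, 40):
    print(f"gap {gap:>2} m -> estimated cut-in probability {estimate_cut_in_probability(gap):.2f}")
```

A real agent would of course have to learn those distributions from observed driving rather than assume them; that is the sense in which stochastic modeling enters the narrative here.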

Meanwhile, you can find this and related postings and series at Ubiquitous Computing and Communications – everywhere all the time and its Page 2 and Page 3 continuations. And you can also find a link to this posting, appended to the end of Section I of Reexamining the Fundamentals as a supplemental entry there.

Moore’s law, software design lock-in, and the constraints faced when evolving artificial intelligence 6

This is my 6th posting to a short series on the growth potential and constraints inherent in innovation, as realized as a practical matter (see Reexamining the Fundamentals 2, Section VIII for Parts 1-5.) And this is also my third posting to this series, to explicitly discuss emerging and still forming artificial intelligence technologies as they are and will be impacted upon by software lock-in and its imperatives, and by shared but more arbitrarily determined constraints such as Moore’s law (see Part 4 and Part 5.)

I began discussing overall patterns of technology implementation in an advancing artificial intelligence agent context in Part 4, where I cited a set of possible scenarios that might significantly arise for that in the coming decades, for how artificial intelligence capabilities in general might proliferate, as originally offered in:

• Rose, D. (2014) Enchanted Objects: design, human desire and the internet of things. Scribner.

And to briefly repeat from what I offered there in this context, for smoother continuity of narrative, I cited and began discussing those four possible scenarios (using Rose’s names for them) as:

1. Terminal world, in which most or even essentially all human/artificial intelligence agent interactions take place through the “glass slabs and painted pixels” of smart phone and other separating, boundary maintaining interfaces.
2. Prosthetics, in which a major thrust of this technology development is predicated upon human improvement, with the internalization of these new technology capabilities within us.
3. Animism, and the emergence of artificial intelligence ubiquity through the development and distribution of seemingly endless numbers of smart robotic and artificially intelligence-enabled nodes.
4. And Enchanted Objects, in which the once routine and mundane of our everyday life becomes imbued with amazing new capabilities. Here, unlike the immediately preceding scenario, focus of attention and of action centers on specific devices and their circumstances that individually rise to prominence for many if not most people, whereas the real impact of the animism scenario would be found in a mass-effect gestalt arising from what are collectively impactful, but individually mostly unnoticed smart(er) parts.

I at least briefly argued the case there for assuming that we will come to see some combination of these scenarios arise in actual practice, as each at least contextually comes to the top as a best approach for at least some set of recurring implementation contexts. And I effectively begin this posting by challenging a basic assumption that I built into that assessment:

• The tacit and all but axiomatic assumption that enters into a great deal of the discussion and analysis of artificial intelligence, and of most other still-emerging technologies as well,
• That while the disruptively novel can and does occur as a matter of principle, it is unlikely to happen, and certainly not right now, in any given technology development context that is actively being pursued along some apparently fruitful current developmental path.

All four of the above repeated and restated scenario options have their roots in our here and now and its more readily predictable linear development moving forward. It is of the nature of the disruptively new and novel that it comes without noticeable warning and precisely in ways that would be unexpected. The truly disruptively novel innovations that arise come like lightning out of a clear blue sky, and they blindside everyone affected by them for their unexpected suddenness and for their emerging impact, as they begin to gain traction in implementation and use. What I am leading up to here is very simple, at least in principle, even if the precise nature of the disruptively new and novel limits our ability to foresee in advance the details of what is to come of that:

• While all of the first four development and innovation scenarios as repeated above, will almost certainly come to play at least something of a role in our strongly artificially intelligence-shaped world to come, we also have to expect all of this to develop and play out in disruptively new ways too, and both as sources of specific contextually relevant solutions for how best to implement this new technology, and for how all of these more context-specific solutions are in effect going to be glued together to form overall, organized systems.

I would specifically stress the two sides to that more generally and open-endedly stated fifth option here, that I just touched upon in passing in the above bullet point. I write here of more locally, contextually specific implementation solutions, here for how artificial intelligence will connect to the human experience. But I also write of the possibility that overarching connectivity frameworks that all more local context solutions would fit into, are likely going to emerge as disruptively new too. And with that noted as a general prediction as to what is likely to come, I turn here to at least consider some of the how and why details of that, that would lead me to make this prediction in the first place.

Let’s start by rethinking some of the implications of a point that I made in Part 4 of this series when first addressing the issues of artificial intelligence, and of artificial intelligence agents per se. We do not even know what artificial general intelligence means, at least at anything like an implementation-capable level of understanding. We do not in fact even know what general intelligence is per se and even just in a more strictly human context, at least where that would mean our knowing what it is and how it arises in anything like a mechanistic sense. And in fact we are, in a fundamental sense, still learning what even just artificial specialized and single task intelligence is and how that might best be implemented.

All of this still-present, significantly impactful lack of knowledge and insight raises the likelihood that all that we know and think that we know here, is going to be upended by the novel, the unexpected and the disruptively so – and probably when we least expect that.

And with this stated, I raise and challenge a second basic assumption that by now should be more generally disavowed, but that still hangs on. A few short decades from now, the billions of human online nodes: the human-operated devices and virtual devices that we connect online through, will collectively account for only a small fraction of the overall online connected universe: the overall connectiverse that we are increasingly living in. All of the rest: soon to be the vast majority of this, will be device-to-device in nature, and fit into what we now refer to as the internet of things. And pertinently to this discussion, that means that a vast majority of the connectedness that is touched upon in the above four (five?) scenarios is not going to be about human connectedness per se at all, except perhaps indirectly. And this very specifically leads me back to what I view as the real imperative of the fifth scenario: the disruptively new and novel pattern of overall connectivity that I made note of above, and certainly when considering the glue that binds our emerging overall systems together, with all of the overarching organizational implications that that option and possibility raises.

Ultimately, what works and both at a more needs-specific contextual level there, and at an overall systems connecting and interconnecting level, is going to be about optimization, with aesthetics and human tastes critically important and certainly for technology solution acceptance – for human-to-human and human-to-artificial intelligence agent contexts. But in a strictly, or even just primarily artificial intelligence agent-to-artificial intelligence agent and dumb device-to-artificial intelligence agent context, efficiency measures will dominate that are not necessarily human usage-centric. And they will shape and drive any evolutionary trends that arise as these overall systems continue to advance and evolve (see Part 3 and Part 5 for their discussions of adaptive peak models and related evolutionary trend describing conceptual tools, as they would apply to this type of context.)

If I were to propose one likely detail that I fully expect to arise in any such overall organizing, disruptively novel interconnection scenario, it is that the nuts and bolts details of the still just emerging overall networking system that I write of here will most likely reside and function at a level that is not explicitly visible, certainly to human participants in it, except where they connect directly into any of the contextual scenario solutions that arise and that are developed and built into it: human-to-human, human-to-device or intelligent agent, or device or agent-to-device or agent. And this overarching technology, optimized in large part by the numerically compelling pressures of device or agent-to-device or agent connectivity needs, will probably take the form of a set of universally accepted and adhered to connectivity protocols: rules of the road that are not going to be all that human-centric.

I am going to continue this discussion in a next series installment, where I will at least selectively examine some of the core issues that I have been addressing up to here in greater detail, and how their realized implementations might be shaped into our day-to-day reality. In anticipation of that line of discussion to come, I will do so from the perspective of considering how essentially all of the functionally significant elements of any such system, at all levels of organizational resolution that would arise in it, are rapidly coevolving and taking form, both in their own immediately connected-in contexts and in any realistic, larger, rapidly emerging connections-defined context too. And this will of necessity bring me back to reconsider some of the first issues that I raised in this series too.

Meanwhile, you can find this and related material at Ubiquitous Computing and Communications – everywhere all the time 3 and also see Page 1 and Page 2 of that directory. And I also include this in my Reexamining the Fundamentals 2 directory as topics Section VIII. And also see its Page 1.

Some thoughts concerning a general theory of business 28: a second round discussion of general theories as such, 3

Posted in blogs and marketing, book recommendations, reexamining the fundamentals by Timothy Platt on April 6, 2019

This is my 28th installment to a series on general theories of business, and on what general theory means as a matter of underlying principle and in this specific context (see Reexamining the Fundamentals directory, Section VI for Parts 1-25 and its Page 2 continuation, Section IX for Parts 26 and 27.)

I began this series in its Parts 1-8 with an initial orienting discussion of general theories per se, with an initial analysis of compendium model theories and of axiomatically grounded general theories as a conceptual starting point for what would follow. And I then turned from that, in Parts 9-25, to at least begin to outline a lower-level, more reductionistic approach to businesses and to thinking about them, one that is based on interpersonal interactions.

Then I began a second round, next step discussion of general theories per se in Part 26 and Part 27, to add to the foundation that I have been discussing theories of business in terms of, and as a continuation of the Parts 1-8 narrative that I began all of this with. More specifically, I used those two postings to begin a more detailed analysis of axioms per se, and of general bodies of theory that are grounded in them, dividing those theories categorically into two basic types:

• Entirely abstract axiomatic bodies of theory that are grounded entirely upon sets of a priori presumed and selected axioms. These theories consist entirely of their particular sets of those axiomatic assumptions, combined with the complex assemblies of theorems and related consequential statements (lemmas, etc.) that can be derived from them on the basis of their own collective internal logic. Think of these as axiomatically enclosed bodies of theory.
• And theory-specifying systems that are axiomatically grounded as above, with at least some a priori assumptions built into them, but that are also at least as significantly grounded in outside-sourced information: empirically measured findings, for example, as would be brought in as observational or experimental data. Think of these as axiomatically open bodies of theory.

Any general theory of business, like any organized body of scientific theory, would fit the second of those basic patterns as discussed here, and particularly in Part 27. My goal for this posting is to continue that line of discussion, with an increasing focus on the empirically grounded theories of that second type, and with an ultimate goal of applying the principles that I discuss here to an explicit theory of business context. That noted, I concluded Part 27 stating that I would turn here to at least begin to examine:

• The issues of completeness and consistency, as those terms are defined and used in a purely mathematical logic context, and as they would be used in any theory that is grounded in descriptive and predictive enumerable form (see the short formal note just after this list). And I will use that more familiar starting point as a basis for more explicitly discussing these same issues as they arise in an empirically grounded body of theory too.
• How new axioms would be added into an already developing body of theory, and how old ones might be reframed, generalized, limited for their expected validity and made into special case rules as a result, or be entirely discarded as organizing principles per se.
• Then, after addressing that set of issues, I said that I would turn to consider issues of scope expansion for the set of axioms assumed in a given theory-based system, with a goal of more fully and analytically discussing optimization for the set of axioms presumed, and what that even means.
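For reference, and as a minimal formal note on how those two terms are standardly defined in mathematical logic (standard textbook phrasing, not drawn from the business-theory sources under discussion):

```latex
% Standard definitions for a formal theory T over a fixed language,
% where  T \vdash \varphi  reads "the sentence \varphi is provable from T".
\begin{align*}
T \text{ is \emph{consistent}} \quad &\Longleftrightarrow \quad
  \text{there is no sentence } \varphi \text{ with both } T \vdash \varphi \text{ and } T \vdash \lnot\varphi,\\
T \text{ is \emph{complete}} \quad &\Longleftrightarrow \quad
  \text{for every sentence } \varphi \text{ of its language, } T \vdash \varphi \text{ or } T \vdash \lnot\varphi.
\end{align*}
```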

And I begin addressing the first of those points by citing two landmark works on the foundations of mathematics:

• Whitehead, A.N. and B. Russell. (1910) Principia Mathematica (in 3 volumes). Cambridge University Press.
• And Gödel’s Incompleteness Theorems.

Alfred North Whitehead and Bertrand Russell set out, in their above-cited magnum opus, to develop and offer a complete, axiomatically grounded foundation for all of arithmetic, as the most basic of all branches of mathematics. And this was in fact viewed as a key step toward fulfilling the promise of David Hilbert: a renowned early 20th century mathematician who sought to develop a comprehensive and all-inclusive single theory of mathematics, in what became known as Hilbert’s Program. All of this was predicated on the validity of an essentially unchallenged metamathematical assumption: that it is in fact possible to encompass arbitrarily large areas of mathematics, and even all of validly provable mathematics as a whole, within a single finite-scaled, completely consistent and completely decidable set of specific axiomatic assumptions. Then Kurt Gödel proved that even just the arithmetical system offered by Whitehead and Russell can never be complete in this sense: it would of necessity carry within it an ongoing requirement for adding in more new axioms to what is supportively presumed for it, and unendingly so, if any truly comprehensive completeness were to be pursued. And on top of that, Gödel proved that such an axiomatic system can never prove its own full consistency by its own means either! This applies to any abstract, enclosed axiomatic system that can in any way be represented arithmetically: any system whose axioms are effectively (computably) enumerable. But setting aside the issues of a body of theory facing this type of limitation simply because its findings, as developed out of its founding assumptions, can be represented in correctly formulated mathematical form (where that might simply mean larger and more inclusive axiomatically enclosed bodies of theory, as distinct from, for example, empirically grounded theories that do depend on outside, non-axiomatic input for their completeness or validity), what does this mean for explicitly empirically grounded bodies of theory, such as larger and more inclusive theories of science, or for purposes of this posting, of business?
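Restating those two results in their more standard modern form, as a paraphrase offered purely for orientation and not as a quotation from either cited source:

```latex
% Gödel's incompleteness theorems, stated informally for an effectively
% axiomatized (computably enumerable) theory T that interprets enough
% elementary arithmetic.
\begin{itemize}
  \item \textbf{First incompleteness theorem:} if $T$ is consistent, then there is a
        sentence $G_T$ in the language of $T$ such that neither $T \vdash G_T$ nor
        $T \vdash \lnot G_T$; that is, $T$ is incomplete.
  \item \textbf{Second incompleteness theorem:} if $T$ is consistent, then
        $T \nvdash \mathrm{Con}(T)$, where $\mathrm{Con}(T)$ is the arithmetized
        sentence asserting that $T$ is consistent.
\end{itemize}
```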

I begin addressing that question by explicitly noting what has to be considered the single most fundamental a priori axiom underlying all scientific theory, and certainly all bodies of theory, such as physics and chemistry, that seek to comprehensively describe and predict what would, in total, include the entire observable universe, from its big bang origins to now and on into the distant future as well:

• Empirically grounded reality is consistent. Systems under consideration, as based at least in principle on specific, direct observation, might undergo phase shifts in which system-dominating properties take on more secondary roles and new ones gain such prominence. But that only reflects a need for a more explicitly comprehensive theory that would account for, explain and explicitly describe all of this predictively describable structure and activity. Underlying that and similar at-least-seeming complexity, the same basic principles, and the same conceptual rules that encode them for descriptive and predictive purposes, hold true everywhere and throughout time.
• To take that out of the abstract, the same basic types of patterns of empirically observable reality that can be representationally modeled by descriptive and predictive rules such as Charles’ law or Boyle’s law would be expected to arise wherever such thermodynamically definable systems do. And the equations they specify would hold true, and with precisely the same levels and types of accuracy, wherever so applied (see the standard forms sketched just below).
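As a concrete instance of the kind of descriptive and predictive rule just cited, here are the standard textbook forms, included only as an illustration of what is being claimed to hold wherever such systems arise:

```latex
% Boyle's law and Charles' law for a fixed quantity of an ideal gas,
% together with the ideal gas law that subsumes them both.
\begin{align*}
\text{Boyle's law (fixed } T, n\text{):} \quad & P_1 V_1 = P_2 V_2\\
\text{Charles' law (fixed } P, n\text{):} \quad & \frac{V_1}{T_1} = \frac{V_2}{T_2}\\
\text{Ideal gas law:} \quad & PV = nRT
\end{align*}
```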

So if an axiomatically closed, in-principle complete-in-and-of-itself system, and the enclosed body of theory that would be derived from it (e.g. Whitehead’s and Russell’s theory of arithmetic), cannot be made fully complete and consistent, as noted above:

• Could grounding a body of theory that can be represented in what amounts to that same form, as if a case in point application of it, in what amounts to a reality-check framework of empirical observation, allow for or even actively support a second possible path to establishing full completeness and consistency there? Rephrasing that: could the addition of theory-framing and theory-shaping outside-sourced evidence, or formally developed experimental or observational data, allow for what amounts to an epistemologically meaningful grounding of a body of theory, through inclusion of an outside-validated framework of presumable consistency?
• Let’s stretch the point made by Gödel, or at least risk doing so, where I still at least tacitly assume bodies of theory that can in some meaningful sense be mapped to a Whitehead and Russell type of formulation of arithmetic, through theory-defined and theory-included descriptive and predictive mathematical models and the equations they contain. Would the same limiting restrictions found in axiomatically enclosed theory systems, as discussed here, also arise in open theory systems so linked to them? And if so, where, how and with what consequences?

As something of an aside perhaps, this somewhat convoluted question does raise an interesting possibility as to the meaning and interpretation of quantum theory, and of quantum indeterminacy in particular, where resolution to a single “realized” solution is only arrived at when observation causes a set of alternative possibilities to collapse down to one. But setting that aside, along with the issue of how this would please anyone who still adheres to the precept of number: of mathematics representing the true prima materia of the universe (as did Pythagoras and his followers), what would this do to anything like an at least strongly empirically grounded, logically elaborated and developed theory such as a general theory of business?

I begin to address that challenge by offering a counterpart to the basic and even primal axiom that I just made note of above, and certainly for the physical sciences:

• Assume that a sufficiently large and complete body of theory can be arrived at,
• That would have a manageable finite set of underlying axiomatic assumptions that would be required by and sufficient to address any given empirically testable contexts that might arise in its practical application,
• And in a manner that at least for those test case purposes would amount to that theory functioning as if it were complete and consistent as an overall conceptual system.
• And assume that this reframing process could be repeated as necessary, when for example disruptively new and unexpected types of empirical observation arise.

And according to this, new underlying axioms would be added as needed: specifically, and once again particularly, when an observer is faced with truly novel, disruptively unexpected findings or occurrences, of a type that I have at least categorically raised and addressed throughout this blog up to here, in business systems and related contexts. And with that, I have begun addressing the second of the three to-address topics points that I listed at the top of this posting:

• How would new axioms be added into an already developing body of theory, and how and when would old ones be reframed, generalized, limited for their expected validity, or discarded as axioms per se? (See the illustrative toy sketch just below.)
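As a purely illustrative toy, and not anything drawn from the sources cited in this series, one way to picture that revision step computationally is as a body of theory that keeps a working set of axioms, confronts them with new observations, and demotes or extends that set when a genuinely novel finding contradicts it. Every class and function name below is a hypothetical invention for this sketch:

```python
# Toy sketch of axiom revision in an "axiomatically open" body of theory.
# Purely illustrative; the class and method names are hypothetical inventions
# for this example, not part of any established framework.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Observation = Dict[str, float]

@dataclass
class Axiom:
    name: str
    # A predicate that says whether an observation is consistent with this axiom.
    holds_for: Callable[[Observation], bool]
    status: str = "general"   # "general", "special-case", or "retired"

@dataclass
class OpenTheory:
    axioms: List[Axiom] = field(default_factory=list)

    def confront(self, obs: Observation) -> None:
        """Check a new observation against every working axiom and revise as needed."""
        for axiom in self.axioms:
            if axiom.status == "retired":
                continue
            if not axiom.holds_for(obs):
                # A disruptive, contradicting observation: demote rather than ignore.
                axiom.status = "special-case"

    def extend(self, new_axiom: Axiom) -> None:
        """Add a new axiom when the existing ones cannot account for what is observed."""
        self.axioms.append(new_axiom)

# Usage: an axiom claiming pressure never exceeds some bound is demoted to a
# special-case rule once an observation violates it, and a broader axiom is added.
theory = OpenTheory([Axiom("bounded-pressure", lambda o: o["pressure"] < 10.0)])
theory.confront({"pressure": 42.0})
theory.extend(Axiom("pressure-volume-tradeoff",
                    lambda o: o["pressure"] * o["volume"] > 0))
print([(a.name, a.status) for a in theory.axioms])
```

The point of the sketch is only that demotion to special-case status, rather than silent retention or outright deletion, mirrors how real bodies of theory tend to absorb disruptive findings.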

I am going to continue this line of discussion in a next series installment, beginning with that topics point as here reworded. And I will turn to and address the third and last point of that list after that, turning back to issues coming from the foundations of mathematics in doing so too. (And I will finally turn to more explicitly discuss issues raised in a book that I have been citing here, but have not yet more formally gotten to in this discussion, and that has been weighing on my thinking about the issues I address here:

• Stillwell, J. (2018) Reverse Mathematics: proofs from the inside out. Princeton University Press.)

Meanwhile, you can find this and related material about what I am attempting to do here at About this Blog and at Blogs and Marketing. And I include this series in my Reexamining the Fundamentals directory and its Page 2 continuation, as topics Sections VI and IX there.

Reconsidering Information Systems Infrastructure 8

Posted in business and convergent technologies, reexamining the fundamentals by Timothy Platt on February 11, 2019

This is the 8th posting to a series that I am developing here, with a goal of analyzing and discussing how artificial intelligence, and the emergence of artificial intelligent agents will transform the electronic and online-enabled information management systems that we have and use. See Ubiquitous Computing and Communications – everywhere all the time 2 and its Page 3 continuation, postings 374 and loosely following for Parts 1-7. And also see two benchmark postings that I initially wrote just over six years apart but that together provided much of the specific impetus for my writing this series: Assumption 6 – The fallacy of the Singularity and the Fallacy of Simple Linear Progression – finding a middle ground and a late 2017 follow-up to that posting.

I have been developing and offering a foundational discussion for thinking about neural networks and their use through most of this series up to here, and more rigorously so since Part 4, when I began discussing and at least semi-mathematically defining and characterizing emergent properties. And I continue that narrative here, with a goal of more cogently and meaningfully discussing the neural network approach to artificial intelligence per se.

I have been pursuing what amounts to a dual track discussion in that, as I have simultaneously been discussing both the emergence of new capabilities in functionally evolving systems, and the all too often seemingly open-ended explosive growth of perceived functional building block needs that might arguably have to be included in any system that would effectively carry out more complex intelligence-based activities (e.g. realistic and human-like speech in a two-way conversational context: natural speech and conversation as first discussed here in Part 6.)

Let’s proceed from that point in this overall narrative to consider a point of significant difference between the new emergent capabilities that putative artificial intelligence agents develop within themselves, as a more generally stated mechanism for expanding their functional reach, and the new presumed-required functional properties and capabilities that keep being added, through scope creep in systems design if nothing else, for overall tasks such as meaningfully open-ended, two-way natural conversation.

• When new task requirements are added to the design and development specifications of a human-directed and managed artificial intelligence agent and its evolution, for carrying out such tasks, they are added in a directed and overall goal-oriented manner, both for their selection and for their individual, component-by-component design.
• But when a system develops and uses new internally developed emergent property capabilities on its own, that development is not necessarily end-goal directed and oriented in anything like the same way. (The biological systems-framed term exaptation, which has effectively replaced an older and presumably loaded term: pre-adaptation, comes immediately to mind in this context, though here I would argue that the serendipitous and unplanned-for connotations of pre-adaptation might make it the better term in this context.)

Let me take that out of the abstract by discussing a recent news story, one that I will return to in other contexts in future writings too, and that I cite here with three closely related references:

One Giant Step for a Chess-Playing Machine,
A General Reinforcement Learning Algorithm that Masters Chess, Shogi, and Go Through Self-Play and
Chess, a Drosophila of Reasoning (where the title of this Science article refers to how Drosophila genetics and its study have opened up our understanding of higher organism genetics, and its realistic prediction that chess will serve a similar role in artificial intelligence systems and their development too.)

The artificial intelligence in question here is named AlphaZero. And to quote from the third of those reference articles:

• “Based on a generic game-playing algorithm, AlphaZero incorporates deep learning and other AI techniques like Monte Carlo tree search to play against itself to generate its own chess knowledge. Unlike top traditional programs like Stockfish and Fritz, which employ many preset evaluation functions as well as massive libraries of opening and endgame moves, AlphaZero starts out knowing only the rules of chess, with no embedded human strategies. In just a few hours, it plays more games against itself than have been recorded in human chess history. It teaches itself the best way to play, reevaluating such fundamental concepts as the relative values of the pieces. It quickly becomes strong enough to defeat the best chess-playing entities in the world, winning 28, drawing 72, and losing none in a victory over Stockfish.” (N.B.: until it met AlphaZero, Stockfish was the most powerful chess player, human or machine, on Earth.)

Some of the details of this innovative advance, as noted there, are of fundamental, game-changing significance. To cite an obvious example, AlphaZero taught itself, in a matter of just a few hours of self-development time, to become by far the most powerful chess player in the world. And it did this without the “benefit” of any expert systems database support, as would be based in this case on human chess grandmaster-sourced knowledge. I put “benefit” in quotes there because all prior best-in-class computer-based, artificial intelligence agent chess players have been built around such pre-developed database resources, even when they have also been given self-learning capabilities that would take them beyond that type of starting point.

I will cite this Science article again in an upcoming series installment, when I turn to address issues such as system opacity and the growing loss of human programmer understanding as to what emerging, self-learning artificial intelligence systems do, and how. My goal here is to pick up on the one human-sourced information resource that AlphaZero did start its learning curve from: a full set of the basic rules of chess, of what is allowed as a move and by which types of chess pieces, and of what constitutes a win or a draw as a game proceeds. Think of that as a counterpart to a higher-level but nevertheless effectively explanatory functional description of what meaningful conversation is: a well defined functional endpoint that such a system would be directed to achieve, to phrase this in terms of my here-working example (a minimal sketch of this rules-only starting point follows below).
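To make that rules-only starting point concrete, here is a minimal, self-contained sketch in Python, using tic-tac-toe as a deliberately tiny stand-in for chess, of what a complete rules specification can look like, together with a crude self-play loop that learns position values purely from games played against itself. This is an illustration of the general idea only; it bears no resemblance to AlphaZero’s actual architecture (no deep network, no Monte Carlo tree search), and every function name here is my own:

```python
# Minimal "rules only" game specification plus a crude self-play learner.
# Tic-tac-toe stands in for chess; the learner is given only the rules (legal
# moves, win/draw conditions) and improves solely from games against itself.
# Illustrative toy only: no neural network, no Monte Carlo tree search.
import random
from collections import defaultdict

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def legal_moves(board):
    return [i for i, c in enumerate(board) if c == "."]

def play(board, move, player):
    return board[:move] + player + board[move+1:]

def result(board):
    """Return 'X' or 'O' for a win, 'draw' for a full drawn board, None if ongoing."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if "." not in board else None

def self_play_train(games=5000, epsilon=0.1, alpha=0.5):
    """Learn state values (from X's point of view) purely by self-play."""
    value = defaultdict(float)                        # state -> estimated value
    for _ in range(games):
        board, player, history = "." * 9, "X", []
        while result(board) is None:
            moves = legal_moves(board)
            if random.random() < epsilon:
                move = random.choice(moves)           # explore
            else:                                     # exploit current value estimates
                scored = [(value[play(board, m, player)], m) for m in moves]
                best = max(scored)[0] if player == "X" else min(scored)[0]
                move = random.choice([m for v, m in scored if v == best])
            board = play(board, move, player)
            history.append(board)
            player = "O" if player == "X" else "X"
        outcome = {"X": 1.0, "O": -1.0, "draw": 0.0}[result(board)]
        for state in history:                         # back up the final outcome
            value[state] += alpha * (outcome - value[state])
    return value

if __name__ == "__main__":
    learned = self_play_train()
    print(f"positions evaluated through self-play alone: {len(learned)}")
```

The relevant point is that everything this learner comes to “know” beyond legal_moves() and result() is generated by self-play, which is precisely the contrast being drawn with would-be conversation-capable agents, for which no comparably compact and conclusive endpoint specification yet exists.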

Note that AlphaZero is characterized by DeepMind, the Alphabet, Inc. subsidiary that developed it, strictly as software, and as software that should be considered platform-independent as long as the hardware that it is run on has sufficient memory, storage and computational power to support it, along with its requisite supportive operating system and related add-on software.

But for purposes of this discussion, let’s focus on the closed and inclusive starting point that a well defined and conclusive set of rules of the game provides for chess (or for Shogi or Go, for AlphaZero), versus the open-ended and ultimately functionally less informative situation that would-be natural conversation-capable artificial intelligence agents face now.

This is where my above-made point regarding self learning systems really enters this narrative:

• … when a system develops and uses new internally-developed emergent property capabilities on its own, that development is not necessarily end-goal directed and oriented and in the same way.

That type of self-learning can work, and tremendously effectively so, even with today’s human-designed and human-coded starting-point self-learning algorithms and with a priori knowledge bases in place for them, if an overall goal that this self-learning would develop towards (a clear counterpart to the rules of chess here) is clearly laid out for it, and when such an agent can both learn new things and question the general validity of what it already has built into it as an established knowledge base. When that is not possible, and particularly when a clear specification is not offered as to the ultimate functional goal desired and what that entails … I find myself citing an old adage as being indicative of what follows:

• “If you don’t know where you are going, any road will do.”

And with that offered, I will turn in my next series installment to offer some initial thoughts on neural network computing in an artificial intelligence context, where that means self-learning and ontological-level self-evolution. And yes, with that noted, I will also return to consider more foundational issues here too, as well as longer-term considerations as all of this comes to impact upon and shape the human experience.

Meanwhile, you can find this and related postings and series at Ubiquitous Computing and Communications – everywhere all the time and its Page 2 and Page 3 continuations. And you can also find a link to this posting, appended to the end of Section I of Reexamining the Fundamentals as a supplemental entry there.
