Platt Perspective on Business and Technology

Reconsidering Information Systems Infrastructure 6

Posted in business and convergent technologies, reexamining the fundamentals by Timothy Platt on September 20, 2018

This is the 6th posting to a series that I am developing here, with a goal of analyzing and discussing how artificial intelligence, and the emergence of artificially intelligent agents, will transform the electronic and online-enabled information management systems that we have and use. See Ubiquitous Computing and Communications – everywhere all the time 2 and its Page 3 continuation, postings 374 and loosely following for Parts 1-5. And also see two benchmark postings that I initially wrote just over six years apart, but that together provided much of the specific impetus for my writing this series: Assumption 6 – The fallacy of the Singularity and the Fallacy of Simple Linear Progression – finding a middle ground and a late 2017 follow-up to that posting.

I began discussing emergent properties as they would arise in systems of what at least begin individually as simple, specialized artificial intelligent agents, in Part 4, and continued to explore that complex of issues in Part 5. One of the core details that I offered there, and that I will build from as this narrative progresses, is an at least first-take mathematical definition of what an emergent property is. I suggest reviewing Part 5 for that, as what follows here builds very specifically from it as I further explore what emergent properties are, and how they arise in complex systems of simpler components.

First, to orient this overall discussion thread in terms of where it is leading, I offer the following set of points of consideration, starting from the at least semi-mathematical approach taken in Part 5 for specifying when it can be argued that emergent properties are arising at some organizational level in a system of individually simpler artificial intelligence agents:

• Functional outputs, and their associated functionalities, that are not a priori specified at the lower organizational levels from which these process outcomes might first arise,
• And that go beyond, or disruptively extend or expand upon, what would nominally be expected to arise from the functioning of the a priori specified algorithms and their rote application as they functionally exist at those lower organizational levels, would of necessity involve and even require emergent processes and their associated properties. That is all inherent to the basic definition of emergent properties per se that I have offered here. But not all emergent properties would indicate intelligence in and of themselves, or even indicate a developmental move in that direction, as might arise in a complex system as it ontologically develops and more fully emerges.
• A true emerging intelligence, and one of a form that would suggest at least meaningful development in the direction of an arguably general intelligence, would display a capacity to organize, analyze, and use information in new descriptive and/or predictive ways that could not be directly accounted for through the application of available a priori expert system knowledge, through direct application of the preset, a priori-provided algorithms in place for processing and managing that information or of newly added data that such algorithms could act upon, or through some directly constructible combination thereof, as could be carried out by simple, specialized artificial intelligent agents per se.
• This, of course, sets out to define intelligence, and general intelligence in particular, in terms of what it is not, rather than in positively stated, operationalized terms. It also posits at least one of what I would argue to be the key building block elements that would go into creating a true artificial general intelligent agent, but only in “none of the above” terms. But it is among my core goals for this series to at least shed some light on what artificial general intelligence means and can mean in more positive terms, and in terms that might point in the direction of its operationalized realization.
• Note: I will also discuss at least one other such foundational building block in this series: awareness, as a capability to develop descriptive and predictive models of context and of possible next-step outcomes, and beyond that, self-awareness too. I simply note that here in order to further orient this discussion thread within the series as a whole, and its intended overall message.

Let me begin addressing at least some of the above list of points by reframing the issues that I have been addressing here in terms that should, in principle, be directly and empirically testable in the context of actual systems of artificial intelligent agents, as they are studied for how they function. I offer a first step in that direction here by noting that:

• Emergent properties enter this type of results analysis when outputs are found that fit a pattern that can most easily and parsimoniously be described as the output results of at least one new, emergent functioning agent: one separate from anything in the inventory of agents more directly ascertainable as being present in the overall artificial intelligence system under consideration, and certainly as that system was initially developed. (Note that I propose this in terms of Occam’s Razor, which could lead to faulty results in the context of emergent properties, where a true “simplest” and “most parsimonious” explanation might not be apparent, and certainly not a priori. I also posit this in terms of emergent development, as such a system functions and changes as a result. And I assume a self-learning and self-adjusting capability there too, as found in neural network-designed systems.)
• The context in which this happens might be stably consistent and predictable, but it could not be derived, in form or detail, from study of the component parts, or of the simpler systems arrays of them that reside below the level where this activity first arises, and certainly when considered at a more reductionistic level.
• So what would be the simplest, most parsimonious agent design that, at least in broad brushstroke outline, might in fact be expected to produce those novel and at least apparently emergent outputs: outputs that the simpler components below it would not be expected to be able to provide, even categorically, on their own?
• I wrote the above in terms of process and agent outputs, and of where they feed into the larger systems that they belong to as next-step input. Input per se to those simpler agents, and to assemblies of them, would be important there too as a general consideration, given the requirements of developing and maintaining feedback capabilities in these systems. So I am writing here of a need to identify and discern the nature of what might be anomalous output types (and of the emergent agents that generate them), from a simple agent perspective but also from the perspective of the entire system as an organized, integrated whole. (I offer a brief illustrative code sketch of this residual-output framing just below.)
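
To make that framing empirically concrete, here is a minimal code sketch of the residual-output test implied by these bullet points, with every name in it a hypothetical placeholder of my own: known agents are modeled, very reductively, as functions from inputs to sets of outputs, and any observed system output that no inventoried agent accounts for becomes a candidate for ascription to a new, emergent agent.

```python
# A minimal sketch of the residual-output test described above. All names
# are hypothetical: each known agent is modeled, very reductively, as a
# function from an input to the set of outputs it can produce.

from typing import Callable, List, Set

Agent = Callable[[str], Set[str]]

def predicted_outputs(agents: List[Agent], inputs: List[str]) -> Set[str]:
    """Everything the inventoried agents, taken individually, can account for."""
    expected: Set[str] = set()
    for agent in agents:
        for item in inputs:
            expected |= agent(item)
    return expected

def emergent_candidates(agents: List[Agent], inputs: List[str],
                        observed: Set[str]) -> Set[str]:
    """Observed outputs that no known agent accounts for: the outputs most
    parsimoniously described as the work of a new, emergent agent."""
    return observed - predicted_outputs(agents, inputs)

# Toy usage: one known agent that upper-cases its input.
uppercaser: Agent = lambda s: {s.upper()}
print(emergent_candidates([uppercaser], ["hi"], {"HI", "hi hi"}))  # -> {'hi hi'}
```

The Occam’s Razor step would then be to ask: what is the simplest single added agent design that covers that residual set? And as cautioned above, the apparently simplest such design need not be the true one.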

I wrote ambitiously in Part 5 of the ultimate need to be able to specify such a system in terms of the hardware, software, and firmware required, and in terms of what would best be developed and included as database resources that would serve as a counterpart to human long-term and short-term memory for it, briefly considering that level of design and development. And I did so with a goal of developing the earlier discussion steps that I am delving into here, in that direction. From a more software and data oriented perspective, actually designing and building an artificial general intelligence (or rather an embryonic form that could develop into one) would of necessity have to include a determination of the types and volumes of starter data, and of the pre-processed knowledge that could be developed from it, that such a system would initially be primed with, as well as the starter software that its neural network-based growth and development would begin emerging from. And this would require at least all of the minimally necessary hardware support too, to be expanded out as needed.

But the bullet points just stated above, and the second from the last of them in particular, with its call for a simplified model of what it would take to provide those novel outputs, would not call for such a detailed understanding, and certainly not up-front. The goal there is to arrive at a cartoon-like representation, analogous perhaps to the structurally simplified equivalent circuit specified by Thévenin’s theorem, to cite an electronics analogy. In this case that would mean a general, outline-oriented, simplified flow diagram-type representation of the key algorithmic pieces, or rather of their functional roles and goals, that would likely enter into achieving the types of output results observed.
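
For reference, and to ground that analogy: Thévenin’s theorem says that any linear two-terminal circuit, however internally complicated, behaves at its terminals exactly like a single ideal voltage source in series with a single resistance, with (stating the standard result):

```latex
V_{\mathrm{Th}} = V_{\mathrm{oc}}, \qquad R_{\mathrm{Th}} = \frac{V_{\mathrm{oc}}}{I_{\mathrm{sc}}}
```

where V_oc is the open-circuit voltage across the terminals and I_sc is the short-circuit current through them. Two parameters stand in for arbitrary internal complexity, which is exactly the kind of externally faithful, internally agnostic simplification that I am proposing for these agent systems.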

Let me explain that in a little more detail, by citing the learning curve challenges, and the approaches now being attempted, in creating an artificial intelligence agent that is capable of holding a meaningful conversation: if possible with enough flexible capability to pass a Turing test challenge, and certainly as that test design was initially conceived.

First, let’s consider the learning curve example of attempting to build an artificial intelligence system that can convincingly hold a simple, and ideally at least somewhat open-ended seeming conversation: one that actually conveys at least something of a sense of mutual understanding and connection. More explicitly, I refer here to agents that are not limited, in how they can converse, to tightly choreographed “dialogs” that fit highly limited parameters as to what words and syntax are allowed, as exhibited in their extreme form in the simple and overtly non-intelligent automated phone menu systems of the type we all face, when a robo-voice says “if you want A, say or press 1, and if you want B, say or press 2.” I add that these same basic limitations apply, at least categorically, to all automated conversational input and response systems currently publicly available too, such as Alexa or Siri or their current peers, even if they do show wider functional flexibility than that first example automated phone support system does. See:

Alexa vs. Siri vs. Google: Which Can Carry on a Conversation Best?

This news piece reads as a briefly organized, Consumer Reports-style article on how a short set of digital assistants compare functionally when set side by side with the same set of test case tasks – in this case the same set of conversation-framed help requests. But I cite it here for a brief supporting detail, added in to flesh out that narrative flow: a brief and simplified outline of what an artificial intelligence conversational agent would require in its array of simpler agent subsystems for it to work. To reiterate, and then to start building from that list for purposes of this narrative, this Times piece posited that such a system would include, among other elements:

• A subsystem that tries to recognize each word and convert it to text.
• A subsystem that tries to understand the meaning of each word and how it relates to the others.
• A third subsystem that “spits out” new text that responds to what you’ve said.
• And a fourth one that converts this response to digital speech. (This is actually the one piece of this puzzle, as offered in the above-cited news piece, that is already fairly well developed, given the text-to-speech software currently available. I include it here primarily for completeness.)

I retained the original wording of those four points where possible, while still fitting their overall message into this context grammatically. My goal here is to discuss both how and why each of the four simpler subsystems identified there, and the first three of them in particular, is in fact going to have to be realized through the development of very complex artificial intelligent agent arrays, with each element of those arrays carrying out complex tasks. And I will also at least briefly note how the functioning of each of them would critically depend on feedback-supporting connectivity between all of them. Think of the above four as a complete group, as representing a first-take effort to capture one of the Thévenin-like simplifications that I proposed above in an emergent properties context, in a few simple text clauses (as restated here, and as sketched in skeletal code form just below).
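
Here is that skeletal code form: a minimal structural sketch of the four subsystems as a strictly one-way pipeline, with every function name and stub body invented by me for illustration. The point is the shape of the decomposition; note that the feedback paths just mentioned have nowhere to live in so simple a chain.

```python
# A minimal structural sketch of the four-subsystem outline above. All
# function names and stub bodies are hypothetical placeholders: each stage
# would in practice unfold into a complex array of interacting agents.

def recognize_words(audio: bytes) -> str:
    # Subsystem 1: try to recognize each word and convert the audio to text.
    return "what time is it"          # stub: a canned transcription

def parse_meaning(text: str) -> dict:
    # Subsystem 2: work out what each word means and how the words relate.
    words = text.split()
    intent = "ask_time" if "time" in words else "unknown"
    return {"intent": intent, "words": words}

def compose_reply(meaning: dict) -> str:
    # Subsystem 3: "spit out" new text that responds to what was said.
    return "It is noon." if meaning["intent"] == "ask_time" else "Could you rephrase that?"

def synthesize_speech(reply: str) -> bytes:
    # Subsystem 4: convert the response text to digital speech.
    return reply.encode("utf-8")      # stub: bytes standing in for audio

def converse_turn(audio: bytes) -> bytes:
    # One conversational turn as a one-way chain; the feedback loops the text
    # calls for (e.g. asking clarifying questions) cannot be expressed here.
    return synthesize_speech(compose_reply(parse_meaning(recognize_words(audio))))

print(converse_turn(b"raw audio"))    # -> b'It is noon.'
```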

Let’s consider the first of them in a little more detail, at least categorically breaking open its black box shell for illustrative purposes. And more specifically, let’s focus on just the first half of that, as representing a large enough functional chunk to qualify as a separate subsystem-level element, and array of them, on its own: “a subsystem that tries to recognize each word.”

• First of all, this is clearly a task requirement that is database and expert systems driven. Collectively, the words encompassed here would have to at least begin with the roughly ten thousand or so most commonly used words that an average member of the community of users these digital assistants would communicate with, uses routinely. This subset of the (for now presumed to be just one) language in use would be available as a basic minimum database resource in the most rapidly accessible areas of active memory in the system, making it available for immediate use, with an added capability for digging into longer-term storage for any words that fall outside of that range readily available too. The rationale behind this is simple: speed and efficiency.
• But even there, I have hopelessly oversimplified a much more complex set of problems. The input that this overall conversational agent receives would be delivered to it verbally, and people speak differently, with different within-country regional and other accents and pronunciations, if nothing else. And then there are homophones: same-sounding words that mean different things. And there are colloquial expressions that add whole new types of meaning to even what would ostensibly be simple, single-meaning words. And the same word can in fact have several or even many different meanings, overt and more nuanced, with all of those points of distinction at least potentially important to the overall meaning conveyed. And I have just scratched the surface here of how complications can arise from attempting to address this challenge with a simple, single word by single word database look-up system. Such a system will have to be able to analyze the context that each of these words appears in too, both within its more immediate setting and, if necessary, from the conversation as a whole. So adequately addressing task one of the Times article list (or rather its first half) would also call for close and active feedback-driven collaboration with the array of agents responsible for the second of those tasks too.
• And cutting ahead to the third subsystem as listed above, a reply capacity, and bringing feedback and systems interconnectivity into this narrative: this is where the conversational agent that this fits into will have to be able to ask clarifying questions, in order to resolve ambiguities or errors as they arise in addressing that task one, first half part of this overall system too.
• These systems, both as a whole and at lower organizational levels, are going to have to be driven by probabilistic analyses and determinations of what was most likely meant in a conversational statement, rather than by a more deterministic establishment of some one and only possible meaning of the input so received. Human to human conversation, I add, works that way too, as is illustrated by the very human capacity to misunderstand or incorrectly infer, which can and does arise when people converse with each other, even when really paying attention. Robotic systems will of necessity face that same limitation (and sometimes strength too). And breaking away from a seemingly easier to develop deterministic approach here, for a chance at creating a more overall-effective, probabilistic analysis-driven one, would increase the complexity of these arrays, and very significantly too. (I offer a toy numerical illustration of this point just below.)
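
To illustrate that last, probabilistic point, here is a toy sketch with all numbers invented for the purpose: rather than asserting one deterministic match, the recognizer scores each candidate word by how well it fits the acoustic evidence and how well it fits the surrounding context, and commits to the highest-scoring candidate. Homophones such as “two,” “too,” and “to” are the classic case where the acoustics alone cannot decide.

```python
# Toy illustration of probabilistic word resolution (all numbers invented).
# score(word) = P(sound | word) * P(word | context); the system commits to
# the most probable candidate rather than to a single deterministically
# "correct" one.

acoustic_match = {"two": 0.33, "too": 0.33, "to": 0.34}  # near-identical sounds

context_fit = {        # fit to the blank in "I would like ___ apples"
    "two": 0.80,       # a count word fits this slot well
    "too": 0.05,
    "to": 0.15,
}

def resolve(candidates):
    # Pick the word with the highest combined acoustic-and-context score.
    return max(candidates, key=lambda w: acoustic_match[w] * context_fit[w])

print(resolve(["two", "too", "to"]))  # -> "two"
```

Note that “to” actually wins on acoustics alone here (0.34); it is the context term that flips the decision, which is the whole point of a probabilistic, feedback-informed design.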

And this is still just a briefly stated and gap-ridden expansion of that originally stated subsystem and what would go into it, and one that has primarily focused on just one half of one of the task elements listed in the Times piece example that I have been considering here. A real conversation, for example, has to be able to move along fairly quickly, but the above first take on flow diagramming its basic needs would be incredibly slow and cumbersome, and certainly so if this agent had to look up every possible word in its memory system for a best meaning and usage match, every single time. So optimization becomes essential there, calling for at least some combination of:

• Phonemic analysis to help facilitate and speed up word perception and use,
• A basic recognition of the standard frequencies of use of the words in that stored vocabulary, and certainly among the here-stated ten thousand or so most frequently used of them. The definite and indefinite articles “the” and “a,” for example, are used more than a word like “agile” is, and a lot more often than words such as “sesquipedalian” are (which for most of us would not fit into our top ten thousand anyway).
• Though such a system also has to be able to flexibly accommodate and learn from the word usage of the people it speaks with. So if someone it deals with uses a word like “agile” or “sesquipedalian” (or the phrase “agile sesquipedalian”), it can move that word into a more frequently used list for them – as an entry in an individualized, customized, mid-level access word list for conversing with them, if nothing else. (See the vocabulary-tiering sketch just after this list.)
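
Here is that vocabulary-tiering sketch: a minimal illustration with all names and thresholds invented for the purpose. A small hot list stands in for the top ten thousand or so most common words, per-speaker usage is counted, and a rare word that a given speaker keeps using gets promoted into that speaker’s personal, mid-level access list.

```python
# A minimal, hypothetical sketch of tiered vocabulary access with per-speaker
# promotion. HOT_LIST stands in for the ~10,000 most frequently used words;
# PROMOTE_AFTER is an arbitrary threshold chosen for illustration.

from collections import Counter

HOT_LIST = {"the", "a", "and", "of", "to"}
PROMOTE_AFTER = 3

class SpeakerVocabulary:
    def __init__(self):
        self.personal_list = set()   # individualized mid-level access list
        self.usage = Counter()       # how often this speaker uses rare words

    def lookup(self, word: str) -> str:
        if word in HOT_LIST:
            return "hot list"        # fastest tier: always in active memory
        if word in self.personal_list:
            return "personal list"   # mid tier: learned from this speaker
        self.usage[word] += 1        # slow tier: dig into bulk storage
        if self.usage[word] >= PROMOTE_AFTER:
            self.personal_list.add(word)  # they keep using it; promote it
        return "bulk storage"

v = SpeakerVocabulary()
for _ in range(3):
    v.lookup("sesquipedalian")       # three slow look-ups...
print(v.lookup("sesquipedalian"))    # -> "personal list": now promoted
```

The same promotion logic could run in reverse too, demoting words a speaker never uses: one small instance of the self-adjusting behavior assumed throughout this discussion.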

Artificial general intelligence is so difficult because every task or subtask that we seek to address seems to explode, upon more detailed analysis, into a complex subsystem in its own right, and every deeper-level subtask within that seems to do so too. That, I add, brings me back to the driving need for a counterpart to the Thévenin approach to understanding key aspects of more complex circuits and their power needs and restrictions: for outline-modeling these systems too, in ways that can help rein in this seemingly open-ended expansion, at least as a means of finding meaningful starting points to design and build from – and with room for the ontological development of these systems that would help them to identify and fill in gaps in themselves as those issues emerge within them, as part of that basic starter design.

With that last detail noted, I am going to turn in the next installment to this series, to consider one of the 800 pound gorillas in the room that I have pointed towards a few times in this series, but that I have not actually addressed in any real sense up to now: neural networks as they would enter into and arise in artificial intelligence systems, in theory and in emerging practice – and their relationship to a disparaging term, and its software design implications, that I first learned in the late 1960s when first writing code as a teenager: spaghetti code. Note: spaghetti code was also commonly referred to as GoTo code, for the opprobrium that was placed on the GoTo command for how it redirected the logical flow and execution of software that included it (e.g. in programs with lines such as: “if the output of carrying out line A in this program is X, then GoTo some perhaps distantly positioned line Z of that program next, and carry that out and then proceed from there, wherever that leads, until the program finally stops running.”) Hint: neural network designs find ways to create lemonade out of that, while still carefully and fully preserving all of those old flaws too.
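
For readers who never met GoTo code in the wild, here is a toy rendering of mine of the pattern just described. Python has no GoTo statement, so this sketch mimics it with a label-dispatch loop; the tangled structure, not the particular syntax, is the point:

```python
# A toy, hypothetical rendering of GoTo-style control flow. Each "label"
# names a block of the program, and every block ends by naming the label to
# jump to next: exactly the "if line A yields X, GoTo line Z" pattern that
# made such programs so hard to follow.

def spaghetti(x):
    label = "A"
    while label != "STOP":
        if label == "A":
            label = "Z" if x > 0 else "B"   # conditional jump forward
        elif label == "B":
            x += 10
            label = "A"                     # unconditional jump backward
        elif label == "Z":
            print("reached Z with x =", x)
            label = "STOP"

spaghetti(-5)   # control bounces A -> B -> A -> Z before stopping
```

Even at this scale, reconstructing the actual execution order means simulating the jumps in your head; at a few hundred lines that becomes genuinely difficult, hence the opprobrium.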

I will proceed from there to pick up on and more explicitly address at least some of the open issues that I have raised here up to now in this series, though mostly just to the level of acknowledging their significance and the fact that they fit into and belong in its narrative. And as part of that, I will reconsider the modest proposal artificial intelligence example scenario that I began this series with. Meanwhile, you can find this and related postings and series at Ubiquitous Computing and Communications – everywhere all the time and its Page 2 and Page 3 continuations. And you can also find a link to this posting appended to the end of Section I of Reexamining the Fundamentals, as a supplemental entry there.
