Archive for the ‘Technology’ Category

Some thoughts on “Conversational Commerce”

January 20, 2016

I’ve largely neglected this blog lately (too focused on getting Querki off the ground), but this post from Chris Messina got me thinking.

He’s talking up 2016 as the year of “Conversational Commerce” — saying that the coming year will be the time when many companies begin to figure out how to leverage the various chat streams, listening to what users are saying, taking commands that way, and providing services through it.

It wouldn’t surprise me if he’s right about the core point: that companies are going to start aggressively plugging into the chat networks and leveraging them.  But let’s get past the happy dreams of e-commerce riches and look at the implications.

My general reaction to all of this is mild skepticism — not so much that the companies will embrace it, as that the users will.  In particular, my BS detector kind of got pegged by this line:

While you may have bristled when that news app alerted you to “new stories”, you might appreciate a particularly friendly newsbot delivering a personalized recommendation with context that you uniquely care about.

I think he’s underestimating the creepiness factor here, and how people react to intrusions in their conversational stream.  Yes, folks are getting somewhat desensitized to it over time, but I’ve found few who “appreciate” it.  I’m constantly talking to folks who are subtly unsettled by how much the bots, collectively, know about them.  And contrary to the wishful thinking of the various companies, not many people like them.

In general, folks don’t like uninvited intrusion.  We’re all rapidly learning to work around it in news feeds and the like — one can’t survive long on Facebook without developing the mental reflex that renders advertisements and promotions invisible.  But in any sort of true conversational context, it just feels rude to be interrupted.  Too many marketers are forgetting the psychological lesson of spam: when you intrude into an electronic space that people think of as personal, they don’t just quietly ignore it, they get angry.  And you don’t want customers angry at your brand.

There’s a tragedy of the commons here.  If the conversational tools make it possible for commerce to intrude into them, that will be abused by over-eager marketers and technologists.  And at that point, you quickly get into the traditional problem, that bad traffic drives out good.  The line between “good” and “bad” isn’t just fuzzy, it’s entirely subjective: different users will object to different intrusions.  And it won’t take many bad interactions to turn people off the idea entirely, and get them to demand global off switches.

What about requested interactions?  He also makes the point that customers could initiate operations with all of those bots through the conversational stream, and that does make a lot of sense — I can see some real appeal to being able to make requests and have them serviced quickly, without interrupting my flow.

(For example, someone on one of my Gitter feeds the other day introduced a little bot that lets you evaluate expressions right in the conversation — it’s great for illustrating technical points, and folks have taken to it quickly.)

But then he undermines the point:

Discovery of discreet conversational services becomes less of an issue if users are slowly trained to think and type more like programmers.

Annnnd we’re back to wishful thinking.  For decades, the programming community has been like Henry Higgins, bemoaning all those Eliza Doolittles out there and wondering why they can’t be more like us.

Basically, the idea here is that these service-oriented bots become much easier to code if the users would just type in proper commands to them.  The example shown is:

/partyline create:task Write about the future of text-based interfaces

Yes, that’s easier for the program to understand.  But even this simple syntax is going to be enough to turn off the vast majority of customers.  The real core ones, the folks who depend on your tool day-to-day, who are willing to invest real brain cells in it, sure — they have enough skin in the game to make the effort.  But it’s hard to build a business plan around just that hard core.
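To see why that syntax is so attractive to the bot author, here is a minimal Python sketch of a parser for it. The grammar (`/<bot> <verb>:<noun> <free text>`) is only inferred from the single example above, and the names `Command` and `parse_command` are made up for illustration; this is not how any real chat service handles commands.

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar, inferred from the one example: /<bot> <verb>:<noun> <free text>
COMMAND_RE = re.compile(
    r"^/(?P<bot>\w+)\s+(?P<verb>\w+):(?P<noun>\w+)\s+(?P<text>.+)$"
)


class Command(NamedTuple):
    bot: str
    verb: str
    noun: str
    text: str


def parse_command(line: str) -> Optional[Command]:
    """Return a Command if the line follows the strict syntax, else None."""
    match = COMMAND_RE.match(line.strip())
    if match is None:
        return None
    return Command(**match.groupdict())


strict = parse_command(
    "/partyline create:task Write about the future of text-based interfaces"
)
casual = parse_command("partyline, please add a task to write about text interfaces")
```

The strict form parses in a dozen lines; the casual phrasing, which is what most people would actually type, simply comes back as `None`. Everything beyond that one regular expression is where the real cost lies.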

Somewhere, I’ve got a button that reads, “If it has syntax, it isn’t user-friendly”.  Much though we might wish otherwise, it’s still true.  There might come a day when the average person is comfortable with precise command syntax, but I’d bet that we’re still quite a number of years off.

(This topic is near and dear to my heart, since template formatting is a key feature of Querki.  I wound up writing a whole new programming language, just to make it as easy as conceivably possible — and I’m still quite sure that we’re going to need a WYSIWYG wizard on top of that for most users.)

Is it steam-engine time for conversational interfaces?  Probably — the technology is there, and there are uses.  But let’s not forget that we’re in the “hype” part of the cycle here: the reality is going to be more gradual and humdrum.  Syntax-driven interfaces like the one shown above are going to be a niche market — the companies are going to have to invest serious time and money into more naturalistic parsers if they’re going to succeed.  And everyone involved in this growing ecosystem needs to be careful about allowing too much intrusion into the users’ conversational streams.  Otherwise, 2018 will be the year when customers, en masse, begin to reject Conversational Commerce…

So what *should* the identity architecture look like?

February 1, 2012

[Crossposted to Google+, LiveJournal and Art of Conversation. That, in and of itself, illustrates some of the points I’m making.]
I’ve posted a lot (mainly on Google+) about the problems with the way Google is handling identity, and the various dangers of it. The just-linked article describes neatly why Google wants to mess up the identity architecture. But it’s worth spelling out the alternative, and how it should work to be best for the users.

I’ve been meaning to do a long writeup for months now, but keep getting distracted, so here’s the back-of-the-napkin summary. Consider it a sort of technical manifesto.

(Yes, this is the short version. It’s a quick and dirty writeup, just the spark for a lot more discussion. And most of it isn’t that novel: others have talked about it, but haven’t gotten far enough yet.)

There are, in principle, four main layers in a well-constructed Internet identity architecture. I’m not going to go into the fine details, because from this viewpoint they don’t matter as much; what really matters is how they relate to each other. Suffice it to say, none of this is easy, but it’s all technically feasible if folks collectively want it enough.

All of these should be talking to each other through *open* protocols, with no back doors. That’s extremely important: the point of the exercise is that the individual should be able to control each of these layers him or herself. Even more importantly, no single company should be able to lock you into their stack: if you really value your privacy highly, you should get each of these from separate companies.

(I can’t overstate the importance of this. The success of the Internet has largely been due to its embrace of open protocols like IP, TCP, HTTP and so on. It is a travesty that the social network space has festered without them like this.)

The layers are:

Layer 1: Identity — this is the simple statement of “this is me”. Crucially, I should be able to have multiple of these, defined however I like. In my own case, there’s “Mark Waks” (the professional / business identity) and “Justin du Coeur” (the social / club identity). These Identities may provide additional details such as name, gender, or what-have-you, but don’t need to: all they really need to do is provide an authentication mechanism.

We already have Layer 1, in a couple of different forms. There are SOAP-based versions in the form of the WS-Security stack, and those are fairly elegant and well designed. In practice, OpenID is cruder but much more prevalent, works adequately for many purposes, and is used a lot. (Although not nearly as much as it should be.)

Layer 2: Social Grouping — this is the notion of G+ Circles, FB Lists, LJ Flists, and so on: groups of people that you define. These Groups may be public (everyone can see their existence) or private (only you know they exist). A Group is owned by one or more Identities, and contains any number of Identities. Note that a Group does *not* contain people, it contains Identities. One of the core principles here is that people know each other as Identities; how much they know about the relationship of a person and an Identity is a relatively private matter. (That is, lots of people know that “Mark Waks” is “Justin du Coeur”, but that should be a decision I control, not enforced by the software. The former should be in groups about these sorts of technology matters, the latter in discussions of the SCA and fandom.)
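As a rough illustration of that data model, here is a minimal Python sketch. The class and field names (`Identity`, `Group`, `provider`, and the example provider hostnames) are hypothetical and not drawn from any existing standard; the only point is that a Group refers to Identities and never to the person behind them.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class Identity:
    """A 'this is me' handle; nothing here links it to a person."""
    handle: str
    provider: str  # hypothetical: whoever authenticates this Identity


@dataclass
class Group:
    """Owned by Identities and containing Identities, never people."""
    name: str
    owners: Set[Identity]
    members: Set[Identity] = field(default_factory=set)
    public: bool = False  # private Groups are visible only to their owners

    def add(self, identity: Identity) -> None:
        self.members.add(identity)


# Two Identities that happen to belong to one person; the model cannot tell.
mark = Identity("Mark Waks", "idp-one.example")
justin = Identity("Justin du Coeur", "idp-two.example")

tech = Group("Technology", owners={mark})
tech.add(mark)

sca = Group("SCA and fandom", owners={justin}, public=True)
sca.add(justin)
```

Nothing in either Group records that the two Identities are related, so revealing that relationship stays a decision for the person rather than the software.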

There have been some stabs at doing this properly, at least to the extent of sharing group information between applications. I don’t get the feeling that anybody has taken it seriously enough yet, and some providers (notably Facebook) deliberately make life difficult. But it’s been examined a lot.

Layer 3: Application — this is conceptually the top of the stack, but it interacts with the other layers in fairly subtle ways. This is all the stuff you can *do* online. In principle, all functionality belongs here, and shouldn’t get mixed in with the other layers.

Most systems get this wrong, mixing everything from personal information to chat into the Identity layer instead of formally separating it via APIs into a consistent Application layer. In particular, the big providers tend to treat applications as what everyone *else* gets to do, while privileging their own stuff. People have always objected when Microsoft does things like that; there is no reason for companies like Facebook and Google to get let off the hook.

There are some nascent proto-standards for this sort of thing, but I haven’t seen much agreement yet. It’s not going to be real until multiple companies are hosting applications using the same standard, and a fair number of companies are writing applications using it.

Layer 4: Aggregation — this is the elephant in the room, that everyone prefers to ignore, but it’s central to much of the privacy problem.

The thing is, if you really care about your privacy, you need to be able to control how your Identities relate to each other. The Identity provider, the Grouping provider, the Application provider — none of these should have to know about all of your Identities. Moreover, if one of them *does* own the collection of Identities, then they own you in a sense, and we fail the key objective of giving you control over your online world.

This is the heart of the various Google problems. I haven’t yet figured out whether they are being deliberately obtuse about this, or really don’t get it, or are struggling with its implications and (typically of Google) refusing to engage properly with the wider community until they have the one true solution built in-house. It was the heart of the issue with the Real Names policy (if everything has to be under a single real name, you get aggregated whether you like it or not), and it’s the heart of the issue with their new privacy policy (since it is now clear that you can’t separate your identities simply by using different apps).

Now of course, you *can* deal with this today, by creating completely separate accounts and never letting them touch each other; that’s often recommended. But it’s a blithe non-answer, because the simple truth is that that’s horribly inconvenient. There simply isn’t good tool support for it, so at best it’s clunky.

This is the bit that’s actually technically challenging, because it affects the way the rest of the stack works. In principle, you want to be able to aggregate your *views* of applications — for example, be able to see all of the conversations that include all of your Identities in a single place. But doing this while getting real privacy means that the Applications have to be built in such a way that they can’t accidentally “leak” the relationships between the Identities, and that’s tricky. Still, it could likely be managed with a well-controlled environment, with well-defined APIs.

Separating things into clear layers like this, communicating via clear APIs, would improve the online social world in a lot of ways. It would level the playing field, letting in lots of competition in each of these spaces; at the same time, it would make it more economical to build new applications if you didn’t have to rewrite them for each social network.

And I should be clear: it’s entirely reasonable to cheat a bit. So long as a social network allows in outside versions of each of these protocols, there is nothing at all wrong with it offering a full stack of all of them, integrated to make it easier for a naive user to get involved. Yes, there are some market risks with that sort of collusion, but let’s get real: most people want convenience, and do *not* care about things like privacy or openness. (Yes, they should. But the world doesn’t run on nice ideals.)

Why doesn’t it just happen? Plain and simply, because the above architecture doesn’t offer an obvious way to become a billionaire. In that, it’s much like the Internet itself. As an individual, you *want* the social network to be a commodity, the same way that the Internet is. But companies want to lock you into their walled gardens, because that’s how they get rich.

History points the way, though. Originally, the networks themselves were walled gardens — companies like Compuserve and Prodigy tried to lock you into their gardens, providing lots of features but not letting you walk outside. We didn’t put up with it then: we collectively instead went for the messy but inter-connected Internet, and those companies basically wound up in the dustbin.

And there’s no reason for us to put up with walled gardens now. The very fact that Facebook and Google+ (and Livejournal and and and) mostly don’t talk to each other illustrates how broken things are. That’s because each of those companies, ultimately, wants to own you and profit from you. We need to get away from that, and not *let* ourselves be owned.

How do we get there from here? Honestly, a lot of hard work on many people’s parts. Trying honest prototypes and experiments; agreeing standards; ultimately, building a system that does all the sorts of stuff that Facebook and Google+ do in a more open way. The public isn’t going to move away from them because of airy principles; they’re only going to move if we can build an alternative that is *better*, and demonstrate that to them. That’ll take patience.

But I do think it can be done — moreover, I think it *will* happen, because it is closer to what people want. Folks are pretty fed up with the split between the various social networks: it’s a real inconvenience for many people. It’s time to start building The Social Network, the social level corresponding to the unified Internet, so that we stop having to choose to fragment.

(And yes, I’m gradually talking myself into rebooting CommYou, with a radically different business plan…)

Task-oriented conversation is demonstrated again

November 8, 2010

Here’s an interesting little article that ran in Ars Technica a little while ago.  The upshot: people having conversations via SMS/text follow pretty much the behaviour patterns you would expect from a focused conversation.

Basically, they built a mathematical model that describes what you’d expect from two people having a conversation that is about something — an initial burst of activity, then gradually trailing off — and then compared that against real-world SMS traffic data.  Not too shockingly, with some adjustment of tunable parameters, it matched.

There isn’t anything too surprising here, but there’s an important ramification: they’re playing with the mathematics that underlie conversation.  Task-oriented conversation follows some fairly regular patterns, and they’re expressing those patterns.  This likely has implications for people building conversation systems, since it gives you an idea of what to expect and how to optimize for it…

Constructive

October 25, 2010

This was posted recently, in the always-excellent webcomic XKCD:

And what about all the people who won't be able to join the community because they're terrible at making helpful and constructive co -- ... oh

As always when XKCD is at its best, it’s both funny and thought-provoking, and quite on-target.

Here’s the question it raises, though: what’s the comment equivalent of the Turing Test?  Is the issue “bot or not?”, “spam or not?” or “helpful or not?”  Most spambots would fail the test described here; would human-generated astroturf?  Is “constructive” the right measure to use, to distinguish between “should be posted” and not?  It might be; indeed, the product-placement industry is almost based on this concept, and it’s better than simply asking “Do you think this is a bot?”.  But now I find myself looking for the best word to usefully express, “should this be here or not?”

To Bundle, or not To Bundle, that is the Question

October 21, 2010

I just got an unusually formal email from Google, saying that Google Groups is dropping a lot of functionality.  Specifically, they will no longer support customized welcome messages, pages or file storage for groups.  Essentially, they are going to stop pretending that they are competing with Yahoo Groups, in favor of trying to do a better job on mailing lists and forums.

They are quite clear, however, that you can still have group files and pages — it’s just that you should do files through Google Docs, and pages through Google Sites.

On the one hand, this actually makes a good deal of sense.  One of Google’s big problems is that they have lots of systems that are overlapping, or often completely redundant.  Having two separate file-management systems is a bit silly, so refactoring and merging them makes sense.

That said, I worry that they’re missing a key aspect of group identity.  Saying, “You can upload a file, and make it accessible only to a group” is not the same thing as saying, “You can upload a file within your group”.  The functionality may be the same, but the perceived user experience is very, very different.  Context matters, especially when you’re mucking with communities.

And frankly, I find myself disappointed that they claim to be focusing on mailing lists and forums, because that’s not the interesting problem.  I would far rather that they focus on community and identity, which are really the interesting problems that have not yet been well-solved.  Forums are a good use case for those, and it’s possible that they’ll do a lot of good along the way, but I would much rather get a really great, shareable and repurposeable group-management system than just another mailing-list operator.

So we’ll see.  What do you think?  Does this change sound good, bad or indifferent?  Is Google going in the right direction, or are they missing the boat?

Co-editing and conversation

September 29, 2010

I found out today that Microsoft has finally added live co-editing to Word.  In Word 2010, you can have several people working in the same document simultaneously, seeing each other’s edits live as you go.

On the one hand, this is a useful and interesting feature.  I confess, I’d be more impressed if we hadn’t implemented more or less exactly this functionality at one of my earlier startups (Buzzpad) all the way back in 2002; I’m a little distressed by the fact that it’s taken MS this long to catch on.  But be that as it may, it’s still useful.

That said, I suspect that the process is going to turn out to be a bit weak.  (Caveat: I haven’t played with it yet, so I’m going by what the above post says.)  The reason is that they appear to have failed to think about the conversational nature of the interaction.

The thing is, when three of us are co-editing a document, we’re not doing so in isolation.  The co-editing is, usually, an interactive process, where each of us is reviewing each other’s changes, commenting on and tweaking them, and generally bouncing ideas off each other.  Sure, we can each edit in our own little silos, but that’s nowhere near as interesting and useful as a more interactive experience.

So we need to have a conversation as part of this.  As currently constituted, it looks like we need to do that out-of-band.  Microsoft would probably recommend opening up a Messenger conversation for it, and that works, but it’s not a great solution: it loses the document’s context, and the conversation is not itself preserved with the document, so it’s harder to go back later and reconstruct why you made the decisions you did.  As it stands, I suspect that I’ll wind up horribly abusing Word’s comment features to hold in-line conversations.

Moreover, this doesn’t do enough for the asynchronous side of the conversation.  In practice, we’ll usually be editing this document for a while; when I go away and come back, I want to clearly see the changes.  Moreover, I want to see the conversations that led up to those changes, so I can understand them properly.  You can get a bit of this with some of Word’s other features, but it doesn’t look well-integrated.

My guess is that MS decided to finally implement this capability because Wave scared them, and I have to say that I’m disappointed that they didn’t really learn from Wave: this is a comparatively naive-looking version of co-editing.  The Wave notion, of a root blip (typically the document you’re co-editing) with deep conversations both embedded inside it and attached as later blips, takes the conversational side of co-editing much more seriously.  And the ability to quickly review all changes (both new conversation and edits to the blips) makes asynchronous conversation work pretty nicely.

So points to MS for trying, but it’s still pretty weak.  I hope they’ll keep evolving it in better directions, but I suspect that’ll only happen if the open-source Wave project continues to give them a good fright.

How about you?  Do you think you’d use Word’s new co-editing capability?  Is there anything that would make it better for you?

Okay, say it with me: Comments *are* Actions

May 21, 2010

So the good news from yesterday is that Google Buzz has opened up a bunch of APIs.  It’s officially a Labs project, so they’re doing it kind of tentatively (having been bitten in the ass by releasing Buzz itself too quickly and broadly), but by and large the new API looks pretty good.

But to my disappointment (although completely *not* surprise), it bakes flat commenting right into the data model.  If I’m reading this right, you can have “activity” objects (like a post), each of which has exactly one Comment Collection associated with it.

Why does this matter?  Because it makes the usual mistake of thinking about an “action” and a “comment” as completely different things.  They’re not, and it’s pretty broken to think about them that way.  In the larger online world, they’re just elements in the larger conversation that we are each having with our friends.

In practical terms, there are lots of implications here.  For example, by structuring things this way, it means that threaded discussions are right out — currently ruled out by the data model, and never likely to work quite right.  On the flip side, it has no concept of the other ways that an Activity can itself be a Comment — for example, a video, or another discussion, or something like that which is spawned off from a previous one.
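To illustrate the alternative, here is a minimal Python sketch of a model in which a post and a comment are the same kind of thing. The names (`Item`, `reply`, `walk`) are hypothetical, and this is not the Buzz API; it only shows that once every element can carry replies, threading and "an activity that is itself a comment" fall out of the same structure.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Item:
    """One element of a conversation: a post, a comment, a video, anything."""
    author: str
    body: str
    kind: str = "text"  # hypothetical tag: "text", "video", "discussion", ...
    replies: List["Item"] = field(default_factory=list)

    def reply(self, author: str, body: str, kind: str = "text") -> "Item":
        """Attach a reply; the reply is a full Item and can be replied to."""
        child = Item(author, body, kind)
        self.replies.append(child)
        return child


def walk(item: Item, depth: int = 0) -> Iterator[Tuple[int, Item]]:
    """Yield (depth, item) pairs in threaded, depth-first order."""
    yield depth, item
    for child in item.replies:
        yield from walk(child, depth + 1)


post = Item("alice", "Here is my original post")
comment = post.reply("bob", "Interesting, but what about X?")
clip = comment.reply("carol", "My response, as a clip", kind="video")
```

A flat model would need one special case to thread Bob's comment and another to let Carol's video count as a response; here neither needs any.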

None of which is new and different, mind.  It’s just a little depressing to see Google (which often does a good job of analyzing problems) making the same mistake that so many other sites have made.  That’s doubly true now, after Wave did a pretty good job on this.  (Although Wave then tried to do *so* much in the UI that it comes out as a little intimidating.  Their mistake was the opposite: trying to expose every conceptual detail to the user too quickly.)

The conclusion is that, while Buzz is decent at light-touch social-grooming sorts of communication (like Facebook), it’s not likely to ever be good at deep conversation (like LiveJournal) unless they wise up and fix this conceptual problem.  That’s a pity: the world needs more social networks that have a clue about how serious conversations really work…

Crowdsourcing can only take you so far

May 17, 2010

Interesting article here on ReadWriteWeb, about Facebook’s approach to banning.  It’s a bit hyperbolic, but assuming it’s correct (and really, it wouldn’t surprise me), it implies some dangerous naivete on Facebook’s part.

The high concept is that banning on FB is somewhat crowd-sourced — if a lot of people complain about someone, FB auto-bans them.  FB is claiming that this isn’t true, that all bans are reviewed; putting all the stories together, my guess is that the auto-ban *is* true, but that FB then reviews them after-the-fact.  That’s a plausible approach, but not a good one, since it means that a vengeful crowd can at least partly silence their detractors.

Mind, like I said, I don’t think it’s surprising: when you’re dealing with millions of users, including a fair number of trolls, and you have limited staff, you need *some* way to make things manageable.  But a simple numeric auto-ban (which this may well be) is too easy to abuse.  In our modern, polarized world, almost anybody who says anything really interesting is likely to have a crowd against them.

None of which means that an automated solution is impossible or evil — it just means that you need to be smart.  The story implies, quite plausibly, that there is a Facebook page dedicated specifically to listing people to attack with complaints, to get them kicked off.  If so, a smart network-detection system can pick it up.  If twenty completely random people complain about someone, the target is probably a troll.  If the *same* twenty people complain about person after person, then it’s much more likely that the complainers are the trolls (or at least, are abusing the system) — and *they* are the ones who should be banned instead.  At the least, it indicates that something suspicious is going on here, and the automated systems shouldn’t be trusted to make a decision without a human looking into it in detail.
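A minimal Python sketch of that idea follows. It is one simple heuristic under stated assumptions, not a description of anything Facebook does: complaints arrive as `(complainer, target)` pairs, and two complainers count as coordinated once they have jointly complained about at least `min_shared_targets` different people. The function name and the threshold are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Iterable, Set, Tuple


def coordinated_complainers(
    complaints: Iterable[Tuple[str, str]],
    min_shared_targets: int = 3,
) -> Set[str]:
    """Flag complainers who keep turning up together against many targets."""
    complainers_by_target = defaultdict(set)
    for complainer, target in complaints:
        complainers_by_target[target].add(complainer)

    # Count, for every pair of complainers, how many targets they share.
    shared = Counter()
    for complainers in complainers_by_target.values():
        for pair in combinations(sorted(complainers), 2):
            shared[pair] += 1

    flagged = set()
    for (first, second), count in shared.items():
        if count >= min_shared_targets:
            flagged.update((first, second))
    return flagged


mob = ["m1", "m2", "m3"]
complaints = [(m, t) for t in ["t1", "t2", "t3"] for m in mob]
complaints += [("r1", "troll"), ("r2", "troll"), ("r3", "troll"), ("r1", "t1")]
flagged = coordinated_complainers(complaints)
```

The three unrelated people who all complain about one troll are left alone, while the trio that moves from target to target is flagged. In keeping with the point above, a flag like this is better treated as a reason for a human to look than as grounds for an automatic ban.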

Social networks are bigger and in some ways more complex than anything else the world has ever tried to grapple with.  That demands both cleverness, and openness about how you are managing them so that people can poke at those management techniques and find their holes.  I suspect Facebook is failing on both counts.

How would you deal with this?  Do you think automated mechanisms are even legitimate for deciding who to ban?  What tweaks should such a system put into place, to make it harder to abuse?

And speaking of Twitter, let’s talk Metadata

April 19, 2010

Another Twitter topic for today, possibly even more interesting: they’ve finally woken up to the value of metadata.

This one’s not a surprise to me at all — it was in the plans for CommYou, and I’ve always thought that it was necessary.  The thing is, when you’ve got a service like Twitter, that is fundamentally about Text Dammit, you have to wrestle with the question about what to do with the rest of the world.  I mean, there is a lot more to a modern online conversation than just text: pictures, video, even embedded games and such can matter enormously.

There are a variety of ways to deal with this — for example, Wave chose to define an open API so that, if you format your other stuff properly, it can be embedded inside a wave no matter what it is.  Twitter is going a different and arguably more open route, pretty much the same one I was planning on: let people embed whatever metadata they want inside the conversation, and let the Twitter clients decide what to do with it.

(For the non-programmers out there: “metadata” is mostly just a fancy way of saying “other stuff that is attached”.  The formal term in the Twitterverse is “Annotations”.)

We’ll see how they implement it, but I like the general approach.  The implication is that they aren’t particularly trying to control the attached metadata — they’re just going to allow developers to put stuff into Tweets, to use as they see fit.  As this post discusses, that’s potentially problematic, especially if all the developers go haring off in different directions.  But I don’t actually expect that to happen: frankly, the obvious thing for most sensible developers to do is to develop mime-type standards for the various kinds of metadata, so that it works pretty much the same way email does.  Indeed, I’ll be very surprised if we don’t see mime-based metadata extremely quickly after the Annotations feature rolls out, sometime in the next few months.

Impressions?  What uses do you see for this feature?  What dangers do you see?  (It *is* a potential malware vector, but given the diversity of Twitter clients I actually don’t expect that to be an immediate crisis.)

ETA: I just came across this Ars article, which points to this posting, which gets into more detail about how Annotations will work.  Summary: they’re very open-ended, but small.  You can’t actually embed much in the tweet itself (annotations probably capped at 512 bytes initially, 2k in the long run).  That makes lots of sense, but means that we’ll quickly see an ecosystem evolve around linking things *from* tweets.  For example, I give it weeks, at most, before we see clients integrating photo sites with tweets, so that you can do something like take a picture from your phone and just tweet it, with the client saving the photo to a site, putting a link into an annotation, and compatible clients pulling that out and displaying it as if it was simply embedded inside the tweet…
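Here is a rough Python sketch of how the pieces might fit together: a small annotation tagged with a mime type and carrying a link, plus a client that decides what to do with it. The field names (`type`, `href`), the handler table, and the function names are all assumptions for illustration; the actual Annotations format was not yet published, and only the 512-byte figure comes from the reports above.

```python
import json
from typing import Callable, Dict, List

MAX_ANNOTATION_BYTES = 512  # the reported initial cap; may well change

Annotation = Dict[str, str]


def make_annotation(mime_type: str, url: str) -> Annotation:
    """Build a link-style annotation and enforce the size cap."""
    annotation = {"type": mime_type, "href": url}
    size = len(json.dumps(annotation).encode("utf-8"))
    if size > MAX_ANNOTATION_BYTES:
        raise ValueError(f"annotation is {size} bytes, over the cap")
    return annotation


def render(
    text: str,
    annotations: List[Annotation],
    handlers: Dict[str, Callable[[Annotation], str]],
) -> str:
    """Let the client decide: use a handler if one matches, else skip."""
    parts = [text]
    for annotation in annotations:
        major = annotation["type"].split("/", 1)[0]
        handler = handlers.get(annotation["type"]) or handlers.get(major + "/*")
        if handler is not None:
            parts.append(handler(annotation))
    return "\n".join(parts)


photo = make_annotation("image/jpeg", "https://photos.example/abc.jpg")
handlers = {"image/*": lambda a: "[image: " + a["href"] + "]"}
shown = render("Look at this!", [photo], handlers)
```

A client with no handler for a given type just shows the text, which is roughly how mail readers treat attachments they don't understand.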

Wave getting better at talking to the rest of the world

March 2, 2010

Yes, yes — one of these days I’ll get back to Catching the Wave.  But in the meantime, keeping up with the news: the Google Wave Developers blog announced today that they’ve come out with a major revision of the Robots API.

To explain what’s going on here, I need to briefly get into the kinds of automata that Wave allows.  There are two types:

  • The obvious ones are gadgets.  A gadget is something that you see on the screen, generally, and tends to be interactive.  So the Google Map embedded in a wave would be a gadget, or the various implementations of thumbs-up / thumbs-down voting.
  • Subtler, but more important for conversations, are robots.  A robot is an external process that can listen to and modify a conversation.  The most obvious and trivial kind of robot that people started playing with initially was the censorbot: one that would keep an eye on the conversation and bleep out bad words.  But there are lots of others doing all sorts of things — and in particular, mediating between Wave conversation and other media like IM.

The new API has all sorts of new capabilities, but two aspects seem most immediately interesting.  First, a robot can pro-actively push data into a conversation.  This implies to me that you can start building really powerful, really interactive bridges between Wave and other services.  For example, an IM bridge becomes considerably more useful when you can send IMs and have them immediately reflected in a persistent Wave.

Second, again most obviously useful for bridging, is the new Proxying-For capability.  Basically, the robot can say that a change is being made on behalf of a specific person.  This opens up *breathtaking* opportunities for abuse, but you really have to trust the robot before you allow it into your wave in the first place anyway.
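For a feel of what these two capabilities mean, here is a toy Python sketch. To be clear, this is *not* the Wave Robots API: `Blip`, `Conversation`, `censorbot`, and `im_bridge_push` are hypothetical stand-ins, and the word list is invented. It only models a robot that rewrites what it hears, and a bridge that pushes content in on behalf of a named person.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Blip:
    author: str
    text: str
    proxied_by: Optional[str] = None  # robot that posted on the author's behalf


@dataclass
class Conversation:
    blips: List[Blip] = field(default_factory=list)
    robots: List[Callable[[Blip], None]] = field(default_factory=list)

    def post(self, blip: Blip) -> None:
        """Record the blip, then let every robot inspect and modify it."""
        self.blips.append(blip)
        for robot in self.robots:
            robot(blip)


BAD_WORDS = re.compile(r"\b(darn|heck)\b", re.IGNORECASE)


def censorbot(blip: Blip) -> None:
    """Listening robot: bleep out listed words in place."""
    blip.text = BAD_WORDS.sub(lambda m: "*" * len(m.group()), blip.text)


def im_bridge_push(conversation: Conversation, sender: str, message: str) -> None:
    """Bridge robot: push an outside IM in, attributed to its real sender."""
    conversation.post(Blip(author=sender, text=message, proxied_by="im-bridge"))


conv = Conversation(robots=[censorbot])
conv.post(Blip("alice", "Well, darn it"))
im_bridge_push(conv, "bob", "What the HECK")
```

The `proxied_by` field is the crux of the trust issue: the conversation has only the bridge's word for it that Bob said anything at all.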

Put together, it sounds to me like Google is putting a major emphasis behind allowing Wave to interact with outside systems.  That’s smart: they’re trying to make this a key new component of the Internet ecosystem, and the more other parts they can hook into, the more it will be.

What sort of things would you be interested in hooking Wave to?  We’ve seen a lot of examples: pushing bug reports into waves; stockmarket information; IM and email bridges; and so on.  What else?  The sweet spots here appear to be recording transient events, and making it easy to hold conversations around those events as they happen.  I can see lots of obvious applications based simply on email and IM bridging, but I’d bet there are far more off-the-wall possibilities here, like live game scoreboards…