How to use a single AWS Lambda for both Alexa Skills Kit and API.AI?

2018-06-14 02:24:21

In the past, I set up two separate AWS Lambdas written in Java: one for use with Alexa and one for use with Api.ai. Each simply returns "Hello world" to its assistant API, so although they are simple, they work. As I started writing more and more code for each one, I saw how similar my Java code was, and that I was just repeating myself by having two separate Lambdas.

Fast forward to today.

What I'm working on now is having a single AWS lambda that can handle input from both Alexa and Api.ai but I'm having some trouble. Currently, my thought is that when the lambda is run, there would be a simple if statement like so:

The following is not real code, just what I think I can do in my head

if (figureOutIfInputType.equals("alexa")) {
    runAlexaCode();
} else if (figureOutIfInputType.equals("api.ai")) {
    runApiAiCode();
}

The thing is, I now need some way to tell whether the function is being called by Alexa or by Api.ai.

This is my actual Java right now:

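import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class App implements RequestHandler<Object, String> {

  @Override
  public String handleRequest(Object input, Context context) {
    // Log the raw payload to see what each assistant sends
    System.out.println("myLog: " + input.toString());

    return "Hello from AWS";
  }
}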

I then ran the lambda from Alexa and from Api.ai to see what the Object input looked like in Java in each case.

API.ai

{id=asdf-6801-4a9b-a7cd-asdffdsa, timestamp=2017-07-
28T02:21:15.337Z, lang=en, result={source=agent, resolvedQuery=hi how 
are you, action=, actionIncomplete=false, parameters={}, contexts=[], 
metadata={intentId=asdf-3a2a-49b6-8a45-97e97243b1d7, 
webhookUsed=true, webhookForSlotFillingUsed=false, 
webhookResponseTime=182, intentName=myIntent}, fulfillment=
{messages=[{type=0, speech=I have failed}]}, score=1}, status=
{code=200, errorType=success}, sessionId=asdf-a7ac-43c8-8ae8-
bc1bf5ecaad0}

Alexa

{version=1.0, session={new=true, sessionId=amzn1.echo-api.session.asdf-
7e03-4c35-9d98-d416eefc5b23, application=    
{applicationId=amzn1.ask.skill.asdf-a02e-4938-a747-109ea09539aa}, user=        
{userId=amzn1.ask.account.asdf}}, context={AudioPlayer=
{playerActivity=IDLE}, System={application=
{applicationId=amzn1.ask.skill.07c854eb-a02e-4938-a747-109ea09539aa}, 
user={userId=amzn1.ask.account.asdf}, device=
{deviceId=amzn1.ask.device.asdf, supportedInterfaces={AudioPlayer={}}}, 
apiEndpoint=https://api.amazonalexa.com}}, request={type=IntentRequest, 
requestId=amzn1.echo-api.request.asdf-5de5-4930-8f04-9acf2130e6b8, 
timestamp=2017-07-28T05:07:30Z, locale=en-US, intent=
{name=HelloWorldIntent, confirmationStatus=NONE}}}

So now I have both my Alexa and Api.ai output, and they're different. That's good: I'll be able to tell which one is which. But I'm stuck. I'm not really sure whether I should try to create an AlexaInput object and an ApiAIinput object.

Am I doing this all wrong? Am I wrong to try to have one lambda fulfill my "assistant" requests from more than one service (Alexa and Api.ai)?

Any help would be appreciated. Surely, someone else must be writing their assistant functionality in AWS and wants to reuse their code for both "assistant" platforms.

I had the same question and same thought, but as I got further and further in implementing, I realized that it wasn't quite practical for one big reason:

While a lot of my logic needed to be the same, the format of the results was different. Sometimes even the details or wording of the results would differ between platforms.

What I did was go back to some concepts that were familiar in web programming by dividing it into two parts:

A back-end system that was responsible for taking parameters and applying the business logic to produce results. These results would be fairly low-level, not entire phrases, but more a set of keys/value pairs that indicated what kind of result to give and what values would be needed in that result.

A front-end system that was responsible for handling things that were Alexa/Assistant specific. So it would take the request, extract parameters and state, call the back-end system with this information, get a result back which included what kind of reply to send and the values needed, and then format the exact phrase (and any other supporting info, such as a card or whatever) and put it into a properly formatted response.

The front-end components would be a different lambda function for each agent type, mostly to make the logic a little cleaner. The back-end components can either be a library function or another lambda function, whatever makes the most sense for the task, but is independent of the front-end implementation.
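As a rough illustration of that split, here is a minimal sketch in plain Java. All class names, the "hello" intent, and the result keys ("kind", "name") are hypothetical, and the response envelopes are heavily simplified versions of what Alexa and API.AI expect. The point is only that the back end returns key/value pairs while each front end chooses its own wording and envelope.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Back end: platform-neutral logic. It returns keys and values, never finished phrases.
final class GreetingBackend {
    static Map<String, String> handle(String intent, Map<String, String> params) {
        Map<String, String> result = new LinkedHashMap<>();
        if ("hello".equals(intent)) {            // "hello" is a made-up intent name
            result.put("kind", "greeting");
            result.put("name", params.getOrDefault("name", "world"));
        } else {
            result.put("kind", "unknown");
        }
        return result;
    }
}

// Front end for Alexa: wraps the result in a simplified Alexa-style response body.
final class AlexaFrontEnd {
    static Map<String, Object> respond(String intent, Map<String, String> params) {
        Map<String, String> result = GreetingBackend.handle(intent, params);
        String text = "greeting".equals(result.get("kind"))
                ? "Hello, " + result.get("name") + "!"
                : "Sorry, I didn't get that.";
        Map<String, Object> speech = new LinkedHashMap<>();
        speech.put("type", "PlainText");
        speech.put("text", text);
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("outputSpeech", speech);
        response.put("shouldEndSession", true);
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("version", "1.0");
        body.put("response", response);
        return body;
    }
}

// Front end for API.AI: same back end, different wording and a simplified webhook body.
final class ApiAiFrontEnd {
    static Map<String, Object> respond(String intent, Map<String, String> params) {
        Map<String, String> result = GreetingBackend.handle(intent, params);
        String text = "greeting".equals(result.get("kind"))
                ? "Hi " + result.get("name")
                : "I'm not sure what you mean.";
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("speech", text);
        body.put("displayText", text);
        return body;
    }
}
```

Because neither front end knows how the result was computed, the back end could later move into its own library or Lambda without touching the formatting code.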

I suppose one could also do this by having an abstract parent class that implements the back-end logic, with the front-end logic in subclasses of it. I wouldn't do it this way because it doesn't provide as clear an interface boundary between the two, but it's not unreasonable.

You can achieve the result (code reuse) a different way.

Firstly, create a method for each type of event (Alexa, API Gateway, etc) using the aws-lambda-java-events library. Some information here: http://docs.aws.amazon.com/lambda/latest/dg/java-programming-model-handler-types.html

Each entry point method should deal with the semantics of the event triggering it (API Gateway) and call into common code to give you code reuse.

Secondly, upload your JAR/ZIP to an S3 bucket.

Thirdly, for each event you want to handle - create a Lambda function, referencing the same ZIP/JAR in the S3 bucket and specifying the relevant entry point.

This way, you'll get code reuse without having to juggle multiple copies of the code on AWS, albeit at the cost of having multiple Lambdas defined.
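The entry-point idea could look roughly like the sketch below. The class and method names are hypothetical, and to keep the sketch compilable on its own it takes the event as a plain Map rather than using any type from aws-lambda-java-events. The field paths (request.intent.name and result.metadata.intentName) are taken from the payloads logged in the question. Each Lambda function would then be configured with a different handler string, for example AssistantHandlers::handleAlexa for one and AssistantHandlers::handleApiAi for the other.

```java
import java.util.Collections;
import java.util.Map;

// One class in one JAR, with one entry point per event source.
final class AssistantHandlers {

    // Common code shared by both entry points.
    static String replyFor(String intentName) {
        return intentName == null
                ? "Sorry, I didn't catch that."
                : "You asked for " + intentName + ".";
    }

    // Entry point for the Alexa-triggered Lambda: the intent lives at request.intent.name.
    public static String handleAlexa(Map<String, Object> event) {
        Object name = child(child(event, "request"), "intent").get("name");
        return replyFor(name == null ? null : name.toString());
    }

    // Entry point for the API.AI-triggered Lambda: the intent lives at result.metadata.intentName.
    public static String handleApiAi(Map<String, Object> event) {
        Object name = child(child(event, "result"), "metadata").get("intentName");
        return replyFor(name == null ? null : name.toString());
    }

    // Returns the nested map stored under key, or an empty map if it is missing.
    @SuppressWarnings("unchecked")
    private static Map<String, Object> child(Map<String, Object> map, String key) {
        Object value = map.get(key);
        return value instanceof Map
                ? (Map<String, Object>) value
                : Collections.<String, Object>emptyMap();
    }
}
```

Returning a bare String mirrors the handler in the question; a real skill or agent would return a properly structured response object from each entry point instead.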

There's a great tool that supports working this way called Serverless Framework which I'd highly recommend looking at: https://serverless.com/framework/docs/providers/aws/

I've been using a single Lambda to handle Alexa ASK and Microsoft Luis.ai responses. I'm using Python instead of Java, but the idea is the same, and I believe that using an AlexaInput and an ApiAIinput object, both implementing the same interface, should be the way to go.

I first use the context information to identify where the request is coming from and parse it into the appropriate object (I use a simple nested dictionary). I then pass this to my main processing function and, finally, pass the output to a formatter, again chosen based on the context. The formatter will be aware of what you need to return. The only caveat is handling session information, which in my case I serialize to my own DynamoDB table anyway.
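In Java, that single-Lambda flow might be sketched as follows. This is only an illustration under stated assumptions: the sender is guessed from the top-level keys seen in the question's logged payloads ("request" for Alexa, "result" for API.AI), the AssistantInput interface and the Payloads helper are hypothetical names, and the "formatter" is reduced to wrapping the Alexa reply in SSML speak tags.

```java
import java.util.Collections;
import java.util.Map;

// Common view of a request, whichever assistant sent it.
interface AssistantInput {
    String source();
    String intentName();
}

final class AlexaInput implements AssistantInput {
    private final Map<String, Object> event;
    AlexaInput(Map<String, Object> event) { this.event = event; }
    public String source() { return "alexa"; }
    public String intentName() {
        return (String) Payloads.child(Payloads.child(event, "request"), "intent").get("name");
    }
}

final class ApiAIinput implements AssistantInput {
    private final Map<String, Object> event;
    ApiAIinput(Map<String, Object> event) { this.event = event; }
    public String source() { return "api.ai"; }
    public String intentName() {
        return (String) Payloads.child(Payloads.child(event, "result"), "metadata").get("intentName");
    }
}

final class Payloads {
    // Returns the nested map stored under key, or an empty map if it is missing.
    @SuppressWarnings("unchecked")
    static Map<String, Object> child(Map<String, Object> map, String key) {
        Object value = map.get(key);
        return value instanceof Map
                ? (Map<String, Object>) value
                : Collections.<String, Object>emptyMap();
    }

    // Step 1: identify the sender and wrap the payload (assumed heuristic, see lead-in).
    static AssistantInput parse(Map<String, Object> event) {
        if (event.containsKey("request")) return new AlexaInput(event);
        if (event.containsKey("result")) return new ApiAIinput(event);
        throw new IllegalArgumentException("Unrecognised payload");
    }

    // Steps 2 and 3: shared processing, then source-specific formatting.
    static String handle(Map<String, Object> event) {
        AssistantInput input = parse(event);
        String reply = "Handling " + input.intentName();
        return "alexa".equals(input.source()) ? "<speak>" + reply + "</speak>" : reply;
    }
}
```

Session state is deliberately left out here, since the two platforms carry it differently and the answer above stores it separately anyway.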
