My first Alexa skill: control motorized window shades using Flask, a Raspberry Pi, and AWS Lambda

I recently started getting really into home automation and splurged on motorized shades from Hunter Douglas as well as an Amazon Echo Dot. The shades came with a nice physical control device called a “Pebble”, a hub, and a repeater. The setup, hardware, and iPhone app worked really well, but sadly no thanks to hunterdouglas.com, which gets my vote for “most confusing website” of the month. I may be overthinking their user interface, but the word escheresque comes to mind.

After everything was set up, I was giddy to hook the shades up with Alexa, only to find out that I couldn’t. Serena from Hunter Douglas customer service correctly informed me that:

As of right now the PowerView shades can work with Alexa it would just be done directly through IFITT. There is not a direct connection.

As I didn’t want to deal with the sometimes 10-second lag of IFTTT, my internal response to Serena’s message was: Challenge accepted! My network, my control! How hard could it be? (Spoiler alert: harder than I thought.)

Quick initial research: Alexa has no access to my local network

After some quick googling, I found out that the shade hub has a REST API and that I could at least get a response with a list of rooms and settings. My initial thinking: Alexa is connected to my local network, the shade hub is connected to my local network: perfect! All I need to do is create a small script to have them talk to each other.

That was a dead end. Alexa does not have access to the local network; instead, all of her commands need to be sent through the web. So here’s a list of the pieces I needed.

Necessary pieces to control Hunter Douglas shades through a custom Alexa command

To create my own Alexa skill, I had to put together the following parts:

- Find the local IP address of the Hunter Douglas hub on my home network
- Interact with the hub’s undocumented REST API and map scene IDs to rooms and positions
- Run a Flask application on a Raspberry Pi that exposes a simple, secured API
- Forward traffic from a port on my router to that application so it’s reachable from the web
- Create the Alexa skill itself (interaction model with custom slots)
- Write an AWS Lambda function that connects the skill to the Raspberry Pi endpoint

Find your Hunter Douglas hub’s local IP address

If you don’t already know the local IP address of your Hunter Douglas hub (and chances are, you won’t), it’s time to scan your local network for devices that are listening. A handy tool is nmap, which you can install easily using Homebrew: brew install nmap.

In my case, I knew the hub would be in the default range of my router, so I scanned 192.168.1.*. nmap will list all the IP addresses that respond, so you’ll end up with a list of candidate devices. Then you can send a request to each IP address with any tool you like; my go-to is Postman. If you send a simple GET request, you’ll get a response if anything is listening. Inspect the response headers and you will find your way to the IP address of your Hunter Douglas hub.
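
If you’d rather script that probing step, here’s a minimal Python sketch of the same idea (the 192.168.1.* range is just my router’s default, so adjust it to yours):

import requests

# Probe every address in the range and print whatever headers come back.
# The Hunter Douglas hub should stand out among the responses.
for host in range(1, 255):
    ip = "192.168.1.{}".format(host)
    try:
        r = requests.get("http://" + ip, timeout=0.5)
        print(ip, dict(r.headers))
    except requests.exceptions.RequestException:
        pass  # nothing answering on port 80 at this address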

Interact with your local Hunter Douglas hub REST API

I found most of the information about this in a forum post that describes the endpoints of the Hunter Douglas hub. None of them are official or documented, but they’re still easy to work with.

Pretty much everything happens through the /api/scenes endpoint. A GET request to this endpoint will give you all the scenes you have created (a scene is Hunter Douglas’s term for a stored set of shade positions). A request to /api/scenes?sceneid=123456 will move the shades to the corresponding position.
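
As a quick illustration, assuming the hub sits at 192.168.1.10 and 123456 is one of your scene IDs (both placeholders):

import requests

hub = "http://192.168.1.10"

# List every scene the hub knows about (IDs, names, room IDs).
print(requests.get(hub + "/api/scenes").json())

# Trigger a single scene, i.e. move the shades to that stored position.
requests.get(hub + "/api/scenes", params={"sceneid": 123456})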

Transform all scenes into your own mapping of scenes and rooms

Since the data we get from the list of scenes is completely unorganized and uses numeric IDs for rooms and shade positions, there’s no way around trying out every single scene and writing down its ID. From those IDs you can build your own mapping, which we will later use in our Flask application. Here’s an abbreviated version:

# Room definition.
rooms = {
  "bedroom": 9999,
  "livingroom": 3333,
}

# Scene definition.
scenes = {
  "bedroom": {
    "halfopen": {
      "id": 24341,
      "roomId": 9999,
    },
    "closed": {
      "id":346345,
      "roomId": 9999,
    }
  },
  "livingroom": {
    "closed": {
      "id": 12323,
      "roomId": 3333,
    }
  }
}

No way to interact with Hunter Douglas REST API through the web

(Jumping ahead a bit:) If there already is a Hunter Douglas REST endpoint available, why don’t we just use it instead of creating our own web server?

Obviously, because it would be way less fun (insert canned laughter here). But also, because there is no official documentation on how to interact with your local hub through the web and no developer SDK available to do that. There is, of course, a way for Hunter Douglas to do that already because their mobile app is talking to my hub. Instead, let’s create our own web server, with a custom API endpoint that we fully control.

Set up your own web server on a Raspberry Pi and create a Flask application with a simple, secure API

I didn’t have a running server at home and wanted a cheap solution, so a $35 Raspberry Pi from Amazon is a great option to run your own server. I won’t go too much into the details of setting up your Pi, but instead focus on the server and application.

For the application, I picked Flask, a Python micro-framework, for its speed and ease of setup. To authenticate requests, I decided to use a simple signed key in the request header, generated with itsdangerous. The endpoint follows the format /blinds/<room>/<scene>, with the names of the room and scene based on our own dictionaries.

from flask import Flask, url_for, request, abort
import requests, json
from itsdangerous import URLSafeSerializer

app = Flask(__name__)

# Constant definition.
secret_key = 'yoursecretkey'
header_pass = 'yourpassword'
# Internal IP address for Hunter Douglas hub.
ip = "192.168.1.10"
endpoint = "http://" + ip
print("Hunter Douglas endpoint is {}".format(endpoint))

# Room definition.
rooms = {
  "bedroom": 9999,
  "livingroom": 3333,
}

# Scene definition.
scenes = {
  "bedroom": {
    "halfopen": {
      "id": 24341,
      "roomId": 9999,
    },
    "closed": {
      "id":346345,
      "roomId": 9999,
    }
  },
  "livingroom": {
    "closed": {
      "id": 12323,
      "roomId": 3333,
    }
  }
}

# Find scene for a room and a scene.
def find_scene_data(room, scene):
    return scenes[room][scene]

# Move a shade.
def move_shade(id):
    scene_endpoint = endpoint + "/api/scenes?sceneid=" + str(id)
    r = requests.get(scene_endpoint)
    data = json.loads(r.text)
    if (len(data['scene']['shadeIds']) > 0):
        return True
    else:
        return False

def authenticate_request(hash):
    try:
        s = URLSafeSerializer(secret_key)
        decrypted = s.loads(hash)
        if (decrypted == header_pass):
            return True
        else:
            return False
    except:
        print('Hash not valid.')
        return False

def verify_trusted_source():
    if not authenticate_request(request.headers.get('Auth-Key')):
        abort(403)

@app.route('/blinds/<room>/<scene>')
def blinds(room, scene):
    verify_trusted_source()
    try:
        scene_data = find_scene_data(room, scene)
    except KeyError:
        return "Scene was not found in this room."

    success = move_shade(scene_data['id'])
    print(success)
    return "Moving the blind {} in the {}. using this data: {}".format(scene, room, scene_data)

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0')

Problems with this application: too many hardcoded values

There are two issues with the application as it is.

First, the local IP address of the Hunter Douglas hub is hardcoded. Unfortunately, restarting your router may result in a different IP address for the hub, which requires this code to be adjusted.

Secondly, the scene data is hardcoded. This is no problem if you have already defined all the scenes via the Hunter Douglas mobile app. But if you change your scenes after the fact, you need to update your scenes/rooms dictionary with the new IDs.

Running your application in debug mode and production mode

To test the application, we can use the version of Python 3 that comes pre-installed on the Raspberry Pi, with a simple command (assuming you saved the whole application in a file called app.py):

sudo python3 app.py

This works great for debugging any issues with the application, but for production we need to set up an actual web server that proxies requests to our Python application. For that, I’m setting up an NGINX server with uWSGI. The setup is relatively straightforward; you can just follow this tutorial: Python Flask Web Application on Raspberry Pi with NGINX and uWSGI.

The tutorial describes how to launch uWSGI on boot as a service, but in some cases you might have to make a change manually and restart your uWSGI server, which you can do with this command:

/usr/local/bin/uwsgi --ini /home/pi/code/hdapp/uwsgi_config.ini --uid www-data --gid www-data --daemonize /var/log/uwsgi.log
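
For reference, here is a rough sketch of what the uwsgi_config.ini referenced above might contain. The paths, process count, and port are assumptions based on the examples in this post; the tutorial’s version may use a Unix socket that NGINX proxies to instead.

[uwsgi]
; Directory and module of the Flask application (app.py defines "app").
chdir = /home/pi/code/hdapp
module = app
callable = app

; Process management.
master = true
processes = 2

; Serve HTTP on the port used in the port-forwarding example below (3000).
http-socket = :3000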

Modify router NAT rules to forward external traffic to Raspberry Pi

We now have a running web server on our Raspberry Pi, but so far we can’t actually connect to it from the outside world. To change that, we need to create a rule on our router that forwards traffic from a specific external port to the web server on the Raspberry Pi. This doesn’t come without security implications, so if you open up your home network to the web, make sure you know what that means.

Here’s an example for a TP-Link router, which I may or may not own: TP-Link Router Port Forwarding | Support | No-IP. Basically, you’re going to expose a specific port on your router and forward the traffic to an internal IP address and port. Then, if you’re lucky and have a static public IP address associated with your network, you can use that public IP address and port.

Here’s an example: your network’s public static IP address is 84.14.11.100. You expose port 5200 and forward it to the local IP address of your NGINX server, 192.168.1.100, and the port your uWSGI server is running on, 3000.

This means we can send a request through the web to https://84.14.11.100:5200, and that request will find its way to our local Raspberry Pi’s Flask application running on uWSGI. This (https://84.14.11.100:5200) will be the endpoint we use for our Lambda function.
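
To sanity-check the whole chain before wiring up Alexa, you can call the public endpoint yourself. Here’s a minimal sketch; it assumes the same secret_key and header_pass as the Flask application above and uses the example address from this section (if you haven’t set up a TLS certificate on NGINX, you may need http:// instead):

import requests
from itsdangerous import URLSafeSerializer

# Must match the values configured in the Flask application.
secret_key = 'yoursecretkey'
header_pass = 'yourpassword'

# Sign the password with the shared secret; authenticate_request() on the
# Raspberry Pi verifies this token.
token = URLSafeSerializer(secret_key).dumps(header_pass)

# Example public endpoint from the port-forwarding example above.
r = requests.get("https://84.14.11.100:5200/blinds/bedroom/halfopen",
                 headers={"Auth-Key": token})
print(r.status_code, r.text)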

Create the actual Alexa skill

In order to set up your own skill, you need to log in with your Amazon developer account at Amazon Developer Services. For some reason, Alexa skills don’t live within the regular AWS console, but in their own developer section. Keeping it interesting, AWS!

Once logged in, follow the instructions to create a custom Alexa skill, as described here: Build Skills with the Alexa Skills Kit – Amazon Apps & Games Developer Portal. For our custom skill, we use two variables, or as Amazon calls them, slots. I had trouble using the predefined AMAZON.Room slot for rooms, as it never detected “living room” correctly. Creating your own slots and putting everything together is relatively straightforward using the Interaction Model Builder, which is currently in beta.

Invocation Name for your Alexa dev skill: You’re limited to “ask Alexa”

Amazon distinguishes between Alexa skills in dev mode and published Alexa skills. The only account that can use a dev skill is your own, whereas published skills can be used by anyone. However, publishing a skill and getting it through Amazon’s compliance process is a whole different ball game, which I will skip altogether.

There’s one big issue when creating a dev skill which isn’t quite apparent from the Amazon documentation: the skill seems to be limited to invocation phrases of the form Alexa, ask <invocation name> to do something. So phrases such as Alexa, close the blinds or Alexa, open the blinds won’t work with a dev skill.

With this limitation in mind, I ended up with this phrase: Ask the window blinds to <position> in the <room>. Not exactly how I would have worded it, but I can deal with it.

Final interaction model for window blinds Alexa skill

So here’s my current version of the skill. It doesn’t handle errors very well yet, but it works in most cases.

{
  "intents": [
    {
      "name": "AMAZON.CancelIntent",
      "samples": []
    },
    {
      "name": "AMAZON.HelpIntent",
      "samples": []
    },
    {
      "name": "AMAZON.StopIntent",
      "samples": []
    },
    {
      "name": "moveBlinds",
      "samples": [
        "ask the window blinds to {position} in the {room}",
        "ask the window blinds to {position} in {room}",
        "ask window blinds to {position} in {room}",
        "ask window blinds to {position} in the {room}"
      ],
      "slots": [
        {
          "name": "position",
          "type": "position",
          "samples": []
        },
        {
          "name": "room",
          "type": "room",
          "samples": []
        }
      ]
    }
  ],
  "types": [
    {
      "name": "position",
      "values": [
        {
          "id": "closed",
          "name": {
            "value": "closed",
            "synonyms": [
              "close"
            ]
          }
        },
        {
          "id": "halfopen",
          "name": {
            "value": "half open",
            "synonyms": [
              "half closed",
              "half way"
            ]
          }
        },
        {
          "id": "freshair",
          "name": {
            "value": "fresh air",
            "synonyms": []
          }
        },
        {
          "id": "open",
          "name": {
            "value": "open",
            "synonyms": []
          }
        }
      ]
    },
    {
      "name": "room",
      "values": [
        {
          "id": "bedroom",
          "name": {
            "value": "bedroom",
            "synonyms": []
          }
        },
        {
          "id": "livingroom",
          "name": {
            "value": "living room",
            "synonyms": []
          }
        }
      ]
    }
  ]
}

Entity resolution to map different variations of variables to a single identifier

An Alexa skill gives us the option to define slots along with synonyms and a single identifier. We could define an identifier as “halfopen”, but a user might say “half open” or “half closed”. For our custom Raspberry Pi endpoint, we need that single identifier instead of the variations. The Alexa skill provides us with that data, and we can parse it in our Lambda function.
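
To make this concrete, here is roughly what the position slot looks like in the request when entity resolution succeeds. The structure mirrors what the Lambda function below parses; the values are just illustrative:

# Illustrative (abbreviated) slot data from an Alexa request.
position_slot = {
    "name": "position",
    "value": "half closed",  # what the user actually said
    "resolutions": {
        "resolutionsPerAuthority": [{
            "status": {"code": "ER_SUCCESS_MATCH"},
            "values": [
                {"value": {"name": "half open", "id": "halfopen"}}
            ]
        }]
    }
}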

The big caveat here is that the skill simulator lets us test the skill, but it won’t provide the entity resolution data. So when we look at our CloudWatch logs to see what data was sent, we will see different data than when we try the skill on an actual Alexa device. A good testing tool is Alexa Skill Testing Tool – Echosim.io, which lets us use our laptop as a virtual Alexa device. There, we should see exactly the same results as when using a physical Alexa device.

AWS Lambda function that will serve as the “serverless” endpoint for the Alexa skill

The recommended solution for an Alexa skill endpoint is an AWS Lambda function. Lambda allows you to run code without managing a server, and you pay only for the compute time your code actually uses. Instead of spinning up a server, you just define the code, and AWS runs it in a micro-environment that exists only for the duration of the request.

The Lambda function is the glue between the Alexa skill and our Raspberry Pi server. It parses the event sent by the Alexa skill and forwards the request to the Raspberry Pi endpoint. The URL and auth key are defined as Lambda environment variables, so we don’t have to hardcode them in the code.

Lambda Function

Lambda functions can be written in several languages, including Java, Node.js, and Python. Since the rest of our application is already written in Python, I decided to stick with a single language. Here’s the function:

import os
from datetime import datetime
import urllib.request

# URL, user agent, and auth key, stored in Lambda environment variables.
# URL needs to include the /blinds/ path and a trailing slash, e.g.
# "https://84.14.11.100:5200/blinds/", since the room and scene get
# appended to it below.
URL = os.environ['url']
USER_AGENT = os.environ['user_agent']
AUTH_KEY = os.environ['auth_key']

# Define default error response.
default_error = {
    'version': '1.0',
    'response': {
        'outputSpeech': {
            'type': 'PlainText',
            'text': 'What you said made no sense to me.',
        }
    }
}

# Define default success message.
default_success = {
  'version': '1.0',
  'response': {
      'outputSpeech': {
          'type': 'PlainText',
          'text': 'You just moved your blinds. Congratulations!',
     }
  }  
}

# Determine the target values for the slots.
def get_target_values(slots):
    target_values = {}
    expected_keys = ['position', 'room']
    for key in expected_keys:
        slot = slots[key]
        if slot.get("resolutions", {}):
            # Resolution authorities and resolutions are lists,
            # but we are just using the first option.
            resolution = slot['resolutions']['resolutionsPerAuthority'][0]
            if resolution['status']['code'] == "ER_SUCCESS_MATCH":
                target_values[key] = resolution['values'][0]['value']['id']
            else:
                # No resolution match: fall back to the raw value so we
                # don't hit a KeyError when building the URL below.
                target_values[key] = slot['value']
        else:
            target_values[key] = slot['value']

    return target_values

# Lambda handler that will be called during lambda execution.
def lambda_handler(event, context):
    print(event)

    # Get slot values.
    slots = event['request']['intent'].get("slots", {})
    if not slots:
        return default_error

    # Get target values for slots.
    target_values = get_target_values(slots)
    target_url = URL + target_values['room'] + '/' + target_values['position']
    print(target_url)
    
    try:
        req = urllib.request.Request(target_url,data=None,headers={
            'Auth-Key': AUTH_KEY
        })
        res = urllib.request.urlopen(req)
        if (res.status != 200):
            raise Exception('Call failed')
    except:
        print('Connection failed!')
        raise
    else:
        print('Connection succeeded!')
        return default_success
    finally:
        print('Completed at {}'.format(str(datetime.now())))

Final thoughts

This was a fun mini-project that increased my appetite for creating new, custom Alexa skills. Now that most of the heavy lifting is done (setting up the uWSGI server, exposing it to the web, securing the endpoint, figuring out what’s possible for Alexa dev skills), creating the next custom skill should be much faster. Maybe some sort of integration with Amazon Rekognition – Deep learning-based image analysis.