Deploying a Typesense Instance on EC2
Recently, I needed to migrate a search feature from Django ORM + Postgres vector search to a more robust system. After weighing the paid options, I ended up choosing Typesense as an open source alternative.
I was basically looking to do the following:
- dramatically increase speed
- have good typo tolerance
- support custom filtering and faceting
Algolia was actually my top pick, but the pricing was absolutely crazy. The AI elements would have been extremely nice for our users, but it’s just not in the budget.
Since I went with the open source option, I decided to deploy my own instance directly into the existing application’s VPC. The app is hosted on AWS, and I decided to use CloudFormation to spin up the infrastructure and open everything up.
I’m a HUGE fan of Pulumi for IaC, but wanted to try CloudFormation because sometimes pain is good.
Here is a quick guide on how I got up and running. This isn’t a perfect configuration: it’s just one instance, so you need to make sure downtime and upgrades to the box don’t interfere with users who might be searching the records.
If you just want to copy paste, go here -> [GitHub Repository]
Security groups
First, I needed a security group for the internal application that will talk to the Typesense instance:
InternalAppSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: "SG for typesense instance"
    VpcId: !Ref VPC
    Tags:
      - Key: Name
        Value: "InternalApp-SG"
Next, I created a security group specifically for Typesense that only allows access from the internal application security group on port 8108 (Typesense’s default port), plus SSH access.
I connect through a bastion, so leaving the SSH port open to 0.0.0.0/0 like this is not fine. This was a staging environment; for production you 100% do not want this open.
TypesenseSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: "Allow access to Typesense only from other applications in same VPC"
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 8108
        ToPort: 8108
        SourceSecurityGroupId: !Ref InternalAppSecurityGroup
      - IpProtocol: tcp
        FromPort: 22
        ToPort: 22
        CidrIp: 0.0.0.0/0
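Since the open SSH rule is easy to forget about, here is a small sketch of how one might check a deployed security group for it with the AWS CLI. The function name `ssh_open_to_world` is made up for this example, and it only catches rules whose range starts exactly at port 22, so treat it as a quick sanity check rather than a real audit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) if the given security group has an
# ingress rule starting at port 22 that is open to 0.0.0.0/0.
# Note: rules covering 22 as part of a wider port range are NOT detected,
# and an AWS CLI error is reported the same way as "not open".
ssh_open_to_world() {
  local group_id="$1"
  aws ec2 describe-security-groups --group-ids "$group_id" \
    --query 'SecurityGroups[0].IpPermissions[?FromPort==`22`].IpRanges[].CidrIp' \
    --output text | tr '\t' '\n' | grep -qxF '0.0.0.0/0'
}
```

Usage would look something like `ssh_open_to_world sg-some-id && echo "SSH is open to the world"`, where `sg-some-id` is a placeholder for the real group ID.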
EC2 Instance
For the EC2 instance itself, I just used a cheap box and added a second EBS volume for Typesense’s data:
TypesenseInstance:
  Type: AWS::EC2::Instance
  Properties:
    InstanceType: t4g.micro
    ImageId: !Ref LatestAmiId
    BlockDeviceMappings:
      - DeviceName: "/dev/xvda"
        Ebs:
          VolumeSize: 20
          VolumeType: gp3
      - DeviceName: "/dev/xvdb"
        Ebs:
          VolumeSize: 50
          VolumeType: gp3
          DeleteOnTermination: false
The UserData script handles all the instance setup:
- Updates the system and installs Docker
- Formats and mounts the additional EBS volume
- Runs the Typesense Docker container with the specified API key
There is a bit of a rabbit hole to go down on whether an API key passed in like this is really secure, since it ends up in plain text in the instance’s user data. Again, for production, do what you need to do.
UserData:
  Fn::Base64: !Sub |
    #!/bin/bash
    TypesenseApiKey="${TypesenseApiKey}"
    sudo yum update -y
    sudo yum install -y docker
    sudo systemctl start docker
    sudo systemctl enable docker
    # Format and mount additional EBS volume
    sudo mkfs -t xfs /dev/xvdb
    sudo mkdir -p /data
    sudo mount /dev/xvdb /data
    echo "/dev/xvdb /data xfs defaults 0 0" | sudo tee -a /etc/fstab
    docker run -d \
      -p 8108:8108 \
      -v /data:/data \
      typesense/typesense:27.1 \
      --api-key="$TypesenseApiKey" \
      --data-dir /data
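Because the container is started with `-d`, the script returns before Typesense is actually ready to serve requests. A rough sketch of a readiness check against Typesense’s `/health` endpoint is below. The function name `wait_for_typesense`, the attempt count, and the 2 second sleep are all my own choices here, not something from the template.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll Typesense's /health endpoint until it reports ok.
# Args: base URL (e.g. http://localhost:8108), maximum number of attempts.
# Deliberately avoids the dollar-brace syntax so it could be pasted into a
# !Sub user data block without CloudFormation trying to substitute anything.
wait_for_typesense() {
  local url="$1"
  local attempts="$2"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # /health answers {"ok":true} once the node is ready
    if curl -fsS "$url/health" 2>/dev/null \
        | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}
```

Run from the box itself (or from anything allowed through the security group), a call would look like `wait_for_typesense http://localhost:8108 30 || echo "Typesense did not come up"`.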
The template accepts several parameters to make it reusable across different environments:
- VPC and subnet IDs
- EC2 key pair name for SSH access
- AMI ID (defaults to Amazon Linux 2023)
- Typesense API key
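For the API key parameter, you at least want something random rather than a value you typed by hand. One possible way to generate it is sketched below; the function name `generate_typesense_key` and the 32 byte length are arbitrary choices for illustration. A random key does nothing about the key being visible in user data, so it is only one piece of the puzzle.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print 32 random bytes as a 64-character hex string,
# suitable for passing as the TypesenseApiKey parameter.
generate_typesense_key() {
  # od prints the bytes as hex pairs; tr strips the spaces and line breaks
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

You could then capture it with `TYPESENSE_KEY=$(generate_typesense_key)` and pass `$TYPESENSE_KEY` as the parameter value when creating the stack.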
Finally, the template outputs the private IP address of the instance, which you’ll need to configure your application to connect to Typesense:
Outputs:
  TypesensePrivateIP:
    Description: "Private IP of the Typesense EC2 instance"
    Value: !GetAtt TypesenseInstance.PrivateIp
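Once the stack exists, that output can be read back from the CLI instead of clicking through the console. A minimal sketch, assuming the output key is `TypesensePrivateIP` as above; the function name `typesense_private_ip` is just something I made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the TypesensePrivateIP output of a stack.
# Arg: the CloudFormation stack name.
typesense_private_ip() {
  local stack_name="$1"
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" \
    --query "Stacks[0].Outputs[?OutputKey=='TypesensePrivateIP'].OutputValue" \
    --output text
}
```

For example, `TYPESENSE_HOST=$(typesense_private_ip TypesenseStack)` gives you a value you can drop into the application’s settings.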
Deployment
Here’s an example of how to deploy this template from your local machine using the AWS CLI.
aws cloudformation create-stack --stack-name TypesenseStack \
  --template-body file://typesense-stack.yaml \
  --parameters ParameterKey=VPC,ParameterValue=vpc-some-id \
    ParameterKey=PrivateSubnet,ParameterValue=subnet-some-subnet \
    ParameterKey=KeyPairName,ParameterValue=some-keypair \
    ParameterKey=TypesenseApiKey,ParameterValue=some-random-key
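`create-stack` returns as soon as the request is accepted, not when the resources exist. If you are scripting this, something like the following sketch can block until the stack is done and print the failure reasons if it is not. The function name `wait_for_stack` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for a stack to finish creating.
# Prints a confirmation on success; on failure prints the CREATE_FAILED
# events (resource and reason) and returns non-zero.
wait_for_stack() {
  local stack_name="$1"
  if aws cloudformation wait stack-create-complete --stack-name "$stack_name"; then
    echo "stack $stack_name created"
  else
    aws cloudformation describe-stack-events --stack-name "$stack_name" \
      --query "StackEvents[?ResourceStatus=='CREATE_FAILED'].[LogicalResourceId,ResourceStatusReason]" \
      --output text
    return 1
  fi
}
```

Calling `wait_for_stack TypesenseStack` right after the `create-stack` command keeps the rest of a deploy script from running against a half-built stack.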
Remember that this setup is suitable for development and testing, but prod requires extra steps:
- Setting up monitoring, alerts and restarts
- Implementing backup strategies for Typesense if you care about losing the index. Personally, I am going to reindex daily using the zero-downtime alias method described in their docs
- Adding high availability through multiple instances
- Proper security hardening (less access, dynamically rotated API key, strict SSH rules, server guarded by an ogre, etc.)
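For the alias-based reindexing mentioned in the list, the idea is that the application always searches an alias, and after a fresh collection has been indexed you repoint the alias at it. A rough sketch of that last step using Typesense’s aliases endpoint is below; the function name `swap_typesense_alias` and all the argument values are hypothetical, and creating and populating the new collection is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point an alias at a freshly indexed collection.
# Args: base URL, API key, alias name, target collection name.
# PUT on /aliases/<name> creates the alias or updates it if it exists.
swap_typesense_alias() {
  local base_url="$1" api_key="$2" alias_name="$3" collection="$4"
  curl -fsS -X PUT "$base_url/aliases/$alias_name" \
    -H "X-TYPESENSE-API-KEY: $api_key" \
    -H "Content-Type: application/json" \
    -d "{\"collection_name\": \"$collection\"}"
}
```

A daily job could index into a dated collection and then run something like `swap_typesense_alias "http://$TYPESENSE_HOST:8108" "$TYPESENSE_KEY" records records_new`, after which the old collection can be dropped.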