Blogging for Developers Part 2


After my initial post on getting up and running blogging like a developer with Hexo, I thought I'd follow up with some more detail on hosting with AWS and automating deployments to it, minifying and compressing files before deploying, and finally a quick rundown on setting up robots.txt and a sitemap.

Hosting

As far as hosting goes, after hunting around I found that AWS was a pretty good deal. Using their free tier you can host your entire static site in an S3 bucket at no cost, so long as you stay within the very generous free quotas of:

  • 5 GB of storage
  • 20,000 GET requests
  • 2,000 PUT requests

The other benefit here is that the costs of scaling are fairly small.

For the setup I followed this guide, as I bought my domains through GoDaddy. Note that the use of Route 53 mentioned in the linked guide is not part of the free AWS tier; however, it comes to about $0.51 when all is said and done.
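The linked guide does all of the bucket setup through the AWS console. If you'd rather script it, a rough sketch using the aws-sdk npm package (not otherwise part of this setup) might look like the following; the region, bucket name and error document are placeholders, and you'll still need the public-read bucket policy that the guide walks through.

var AWS = require('aws-sdk');

// Credentials are picked up from the environment or ~/.aws/credentials.
AWS.config.update({ region: 'your-region-here' });

var s3 = new AWS.S3();
var bucket = 'YourDomainHere.com';

s3.createBucket({ Bucket: bucket }, function (err) {
  if (err) { return console.error(err); }

  // Enable static website hosting on the new bucket.
  s3.putBucketWebsite({
    Bucket: bucket,
    WebsiteConfiguration: {
      IndexDocument: { Suffix: 'index.html' },
      ErrorDocument: { Key: '404.html' }
    }
  }, function (err) {
    if (err) { return console.error(err); }
    console.log('Bucket ready for static hosting.');
  });
});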

Minifying

To minify our site's scripts and styles there is the hexo-generator-minify plugin. To install it run npm install hexo-generator-minify --save-dev.

With the plugin installed we need to add a new shell task to the Gruntfile.js that will call the minifying command of the Hexo CLI.

Make sure to add our new shell:generateMinified task to the deploy task.

Gruntfile.js
module.exports = function (grunt) {
  ...
  // Project configuration.
  grunt.initConfig(
    {
      ...
      shell: {
        ...
        generateMinified: {
          command: 'hexo gm'
        }
      }
    });

  grunt.registerTask('deploy', ['shell:clean', 'shell:generateMinified']);
};
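The elided (...) sections in the block above are the bits carried over from the first post. If you're not following along from part 1, here's a minimal sketch of what the full Gruntfile might look like at this point; the hexo clean command behind shell:clean is an assumption on my part rather than something defined in this post.

module.exports = function (grunt) {
  // Project configuration.
  grunt.initConfig({
    shell: {
      // Assumed from part 1: wipe Hexo's previously generated output.
      clean: {
        command: 'hexo clean'
      },
      // Generate the site with minified scripts and styles.
      generateMinified: {
        command: 'hexo gm'
      }
    }
  });

  // grunt-shell provides the shell task used above.
  grunt.loadNpmTasks('grunt-shell');

  grunt.registerTask('deploy', ['shell:clean', 'shell:generateMinified']);
};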

Compression

Next up is gzip compression. In order for us to serve gzipped content from an AWS S3 bucket we need to compress the files before deploying them, and then specify the ContentEncoding: 'gzip' param for those files when uploading them.

For this we will use grunt-contrib-compress. To install run npm install grunt-contrib-compress --save-dev.

We need to modify our Gruntfile.js to include the compress task as well as the options for compression.

Gruntfile.js
module.exports = function (grunt) {
  ...
  grunt.initConfig(
    {
      ...
      compress: {
        deploy: {
          options: {
            mode: 'gzip',
            pretty: true
          },
          expand: true,
          cwd: 'public/',
          src: ['**/*.js', '**/*.html', '**/*.css'],
          dest: 'public/compressed'
        }
      }
    });
  ...
  grunt.loadNpmTasks('grunt-contrib-compress');
  ...
  grunt.registerTask('deploy', ['shell:clean', 'shell:generateMinified', 'compress:deploy']);
};

Note that the compress:deploy task has been added to the deploy task and that the output of the compress task goes into the public/compressed folder.

Deployment

Turns out that deploying to AWS is fairly trivial via the grunt-aws-s3 plugin.

To set up you will need to install the plugin using npm install grunt-aws-s3 --save-dev.

You will need to get your AWS AccessKeyID and SecretKey. Let’s put these values into a separate file called aws-keys.json that we’ll load in Gruntfile.js.

aws-keys.json
{
  "AWSAccessKeyId": "XXX",
  "AWSSecretKey": "YYY"
}

Next up we need to update the Gruntfile.js with our new aws-keys file and the aws-s3 tasks. We have a task for incremental deployments called aws_s3:deploy and a complete overwrite task called aws_s3:overwrite.

The file lists specified make use of the output of our prior compress step. Make sure you set your uploadConcurrency value, as it defaults to 1, which will definitely slow things down as your site grows. For a full list of the usage options check out the GitHub page.

Finally, make sure that you've added the aws_s3:deploy task to your deploy task.

Take note that there are two file sets specified for the deployments. This is so that we can upload our compressed files with the correct ContentEncoding: gzip header, using the output path of the compress task as their source. The second file set covers the files that are not compressed, like images.

Gruntfile.js
module.exports = function (grunt) {
  grunt.initConfig(
    {
      ...
      awsCredentials: grunt.file.readJSON('aws-keys.json'),
      aws_s3: {
        options: {
          accessKeyId: '<%= awsCredentials.AWSAccessKeyId %>',
          secretAccessKey: '<%= awsCredentials.AWSSecretKey %>',
          region: 'your-region-here',
          uploadConcurrency: 5,
          bucket: 'YourDomainHere.com'
        },
        deploy: {
          differential: true,
          files: [
            {
              expand: true,
              cwd: 'public/compressed',
              src: ['**/*.js', '**/*.html', '**/*.css'],
              params: { ContentEncoding: 'gzip' }
            },
            {
              expand: true,
              cwd: 'public/',
              src: ['**/*', '!**/*.js', '!**/*.html', '!**/*.css', '!compressed/*']
            }
          ]
        },
        overwrite: {
          files: [
            {
              expand: true,
              cwd: 'public/compressed',
              src: ['**/*.js', '**/*.html', '**/*.css'],
              params: { ContentEncoding: 'gzip' }
            },
            {
              expand: true,
              cwd: 'public/',
              src: ['**/*', '!**/*.js', '!**/*.html', '!**/*.css', '!compressed/*']
            }
          ]
        }
      }
    });
  ...
  grunt.loadNpmTasks('grunt-aws-s3');
  ...
  grunt.registerTask('deploy', ['shell:clean', 'shell:generateMinified', 'compress:deploy', 'aws_s3:deploy']);
};
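With all of that in place, grunt deploy handles the incremental case. If you ever need a full re-upload you can run grunt aws_s3:overwrite by hand after generating and compressing, or register a second convenience task for it. The deploy-full name below is just my own suggestion, not something the plugin provides; add it alongside the existing registerTask call in Gruntfile.js.

// Hypothetical convenience task: same pipeline as deploy, but ending with the
// overwrite target so every file is pushed up again regardless of changes.
grunt.registerTask('deploy-full', ['shell:clean', 'shell:generateMinified', 'compress:deploy', 'aws_s3:overwrite']);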

Talking To Robots

Sitemap.xml

A sitemap is an XML file that you can use to give search engines and other web crawlers a list of specific URLs on your site to index. It turns out there's a plugin for Hexo that will generate a sitemap for us listing only the content pages of our site and not the secondary pages like categories, archives etc.
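The plugin isn't named above; the usual choice is hexo-generator-sitemap, so I'll assume that's the one meant here. If so, the setup is an npm install hexo-generator-sitemap --save followed by a small entry in Hexo's _config.yml:

_config.yml
# Assuming the hexo-generator-sitemap plugin; this is its default setting and
# controls where the generated file is written during hexo generate.
sitemap:
  path: sitemap.xml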

Robots.txt

The robots.txt file allows us to instruct web crawlers which pages on our site we don't want them to index. Note that this is not going to prevent nefarious bots from scanning every publicly available page on our site; it's merely a means to focus search engines on the content of the site.

We can also point to our sitemap in this file as an added nicety.

To create this file for us automatically there's a Grunt plugin we can install called grunt-robots-txt.

npm install grunt-robots-txt --save-dev

Next up we need to add the task to our Gruntfile.

Gruntfile.js
module.exports = function (grunt) {
  var liveDomain = 'christophermandlbaur.com';
  ...
  grunt.initConfig(
    {
      ...
      robotstxt: {
        deploy: {
          dest: 'public/',
          policy: [
            { // Allow all crawlers to index all pages on our site.
              ua: '*',
              disallow: ''
            },
            { // Specify the path to our sitemap.xml
              sitemap: 'http://' + liveDomain + '/sitemap.xml'
            },
            {
              host: liveDomain
            }
          ]
        }
      }
    });
  ...
  grunt.loadNpmTasks('grunt-robots-txt');
  ...
  grunt.registerTask('deploy', ['shell:clean', 'shell:generateMinified', 'robotstxt:deploy', 'compress:deploy', 'aws_s3:deploy']);
  ...
};

Now our robots.txt file will be generated every time we run the deploy task.
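For reference, with the policy above the generated public/robots.txt should come out looking something like this (the exact ordering and formatting may vary slightly between plugin versions):

robots.txt
User-agent: *
Disallow:
Sitemap: http://christophermandlbaur.com/sitemap.xml
Host: christophermandlbaur.com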