Using Google Guava Cache for local caching

Applications often have to fetch data from a database, from another web service, or from the file system. When a network call is involved, the application pays for network latency and is limited by network bandwidth. One way to reduce that cost is to keep a cache local to the application.

If your application runs on multiple nodes, each node has its own local cache, so different nodes can temporarily hold different values for the same key. This inconsistency is often an acceptable trade-off for better throughput and lower latency. When it does make a significant difference, you can reduce the TTL (time to live) of the cached objects, which shortens the window during which stale data can be served.
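To make the TTL trade-off concrete, here is a minimal sketch using Guava's expireAfterWrite. The class name TtlDemo, the ManualTicker helper, and the one-minute TTL are assumptions made for illustration; the manual ticker only exists so that expiry can be shown without sleeping.

```java
import com.google.common.base.Ticker;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class TtlDemo {

  // Hypothetical manual clock, so the example does not have to wait for real time to pass.
  static class ManualTicker extends Ticker {
    private final AtomicLong nanos = new AtomicLong();

    @Override
    public long read() {
      return nanos.get();
    }

    void advance(long duration, TimeUnit unit) {
      nanos.addAndGet(unit.toNanos(duration));
    }
  }

  // Reads the same key three times and returns how often the loader had to run.
  static int loadsWithOneMinuteTtl() {
    final AtomicInteger loads = new AtomicInteger();
    ManualTicker ticker = new ManualTicker();
    LoadingCache<String, String> cache = CacheBuilder.newBuilder()
        .expireAfterWrite(1, TimeUnit.MINUTES) // assumed TTL, for illustration only
        .ticker(ticker)
        .build(new CacheLoader<String, String>() {
          @Override
          public String load(String key) {
            loads.incrementAndGet();
            return "value-for-" + key;
          }
        });

    cache.getUnchecked("k");               // miss: the loader runs
    ticker.advance(30, TimeUnit.SECONDS);
    cache.getUnchecked("k");               // entry is 30 seconds old: served from cache
    ticker.advance(31, TimeUnit.SECONDS);
    cache.getUnchecked("k");               // entry is older than the TTL: loaded again
    return loads.get();
  }

  public static void main(String[] args) {
    System.out.println("loader invocations: " + loadsWithOneMinuteTtl());
  }
}
```

A shorter TTL means more loader calls but a smaller window in which a node can serve a stale value.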

Among the many ways of implementing a local cache, the one I have used in a high-load environment is Guava Cache. We used it to serve 80,000+ requests per second, with 90th-percentile latencies of about 5 ms. This helped us scale within our limited network bandwidth.

In this post I will show how to add a Guava cache layer in order to avoid frequent network calls. As a very simple example, I fetch the details of a book, given its ISBN, using the Google Books API.

A sample request for fetching book details using an ISBN-13 string is:
https://www.googleapis.com/books/v1/volumes?q=isbn:9781449370770&key=API_KEY

The part of the response that is useful to us looks like:
SampleResponse

A very detailed explanation of the features of Guava Cache can be found here. In this example I use a LoadingCache. A LoadingCache is built with a CacheLoader, a block of code that it uses to load the value for a missing key. When you call get with a key that is not in the cache, the LoadingCache fetches the data using the CacheLoader, stores it in the cache, and returns it to the caller.
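The load-on-miss behaviour can be seen in isolation with a small sketch. The class name LoadOnMissDemo and the upper-casing loader are stand-ins chosen for illustration; a real loader would make the slow call.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class LoadOnMissDemo {

  // Stand-in loader: a real one would hit a database, a web service, or the file system.
  static final LoadingCache<String, String> CACHE = CacheBuilder.newBuilder()
      .build(new CacheLoader<String, String>() {
        @Override
        public String load(String key) {
          return key.toUpperCase();
        }
      });

  public static void main(String[] args) {
    // getIfPresent never calls the loader, so it shows what is currently cached.
    System.out.println(CACHE.getIfPresent("guava"));
    // getUnchecked is like get, for loaders that throw no checked exceptions.
    System.out.println(CACHE.getUnchecked("guava"));
    // After the first get, the value is served from the cache.
    System.out.println(CACHE.getIfPresent("guava"));
  }
}
```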

Let's now look at the model classes we need to represent the book details:

  • Book class
  • Author class

The Book class is defined as:

//Book.java
package info.sanaulla.model;

import java.util.ArrayList;
import java.util.List;

public class Book {
  private String isbn13;
  private List<Author> authors;
  private String publisher;
  private String title;
  private String summary;
  private Integer pageCount;
  private String publishedDate;

  public String getIsbn13() {
    return isbn13;
  }

  public void setIsbn13(String isbn13) {
    this.isbn13 = isbn13;
  }

  public List<Author> getAuthors() {
    return authors;
  }

  public void setAuthors(List<Author> authors) {
    this.authors = authors;
  }

  public String getPublisher() {
    return publisher;
  }

  public void setPublisher(String publisher) {
    this.publisher = publisher;
  }

  public String getTitle() {
    return title;
  }

  public void setTitle(String title) {
    this.title = title;
  }

  public String getSummary() {
    return summary;
  }

  public void setSummary(String summary) {
    this.summary = summary;
  }

  public void addAuthor(Author author){
    if ( authors == null ){
      authors = new ArrayList<Author>();
    }
    authors.add(author);
  }

  public Integer getPageCount() {
    return pageCount;
  }

  public void setPageCount(Integer pageCount) {
    this.pageCount = pageCount;
  }

  public String getPublishedDate() {
    return publishedDate;
  }

  public void setPublishedDate(String publishedDate) {
    this.publishedDate = publishedDate;
  }
}

And the Author class is defined as:

//Author.java
package info.sanaulla.model;

public class Author {

  private String name;

  public String getName() {
    return name;
  }

  public void setName(String name) {
    this.name = name;
  }
}

Let's now define a service, BookService, which fetches the data from the Google Books REST API. This service does the following:

  1. Fetch the HTTP response from the REST API.
  2. Parse the JSON into a Map using Jackson’s ObjectMapper.
  3. Extract the relevant information from the Map obtained in step 2.

I have extracted a few operations from BookService into a Util class, namely:

  1. Reading the application.properties file, which contains the Google Books API key. (I haven’t committed this file to the git repository, but you can add a file named application.properties to your src/main/resources folder and Util will read it for you.)
  2. Making an HTTP request to the REST API and returning the JSON response.

The Util class is defined as follows:

//Util.java
 
package info.sanaulla;

import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Properties;

public class Util {

  private static ObjectMapper objectMapper = new ObjectMapper();
  private static Properties properties = null;

  public static ObjectMapper getObjectMapper(){
    return objectMapper;
  }

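  // Loads application.properties from the classpath once and reuses it afterwards.
  public static synchronized Properties getProperties() throws IOException {
    if ( properties != null){
        return  properties;
    }
    InputStream inputStream = Util.class.getClassLoader().getResourceAsStream("application.properties");
    if ( inputStream == null ){
      throw new IOException("application.properties not found on the classpath");
    }
    Properties loaded = new Properties();
    try {
      loaded.load(inputStream);
    } finally {
      inputStream.close();
    }
    properties = loaded;
    return properties;
  }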

  // Performs a GET request and returns the response body as a single string.
  public static String getHttpResponse(String urlStr) throws IOException {
    URL url = new URL(urlStr);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try {
      conn.setRequestMethod("GET");
      conn.setRequestProperty("Accept", "application/json");
      conn.setConnectTimeout(5000);
      //conn.setReadTimeout(20000);

      int responseCode = conn.getResponseCode();
      if (responseCode != 200) {
        throw new RuntimeException("Failed : HTTP error code : " + responseCode);
      }

      BufferedReader br = new BufferedReader(new InputStreamReader(
            conn.getInputStream(), "UTF-8"));

      StringBuilder outputBuilder = new StringBuilder();
      String output;
      while ((output = br.readLine()) != null) {
        outputBuilder.append(output);
      }
      br.close();
      return outputBuilder.toString();
    } finally {
      // Release the connection even when the request fails.
      conn.disconnect();
    }
  }
}
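The Constants class used by the service below is not listed in this post. As a rough sketch of how the API key lookup fits together, assume the property is named google.api.key; that name, the class ApiKeyLookupDemo, and the placeholder value are all hypothetical, and the properties are read from a string here so that the example runs without a file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class ApiKeyLookupDemo {

  // Hypothetical property name; the real value of Constants.GOOGLE_API_KEY is not shown in the post.
  static final String GOOGLE_API_KEY = "google.api.key";

  // Parses properties text and returns the API key, or null when the key is absent.
  static String readApiKey(String propertiesText) {
    Properties properties = new Properties();
    try {
      properties.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new IllegalStateException("Could not parse properties", e);
    }
    return properties.getProperty(GOOGLE_API_KEY);
  }

  public static void main(String[] args) {
    // The same line would go into src/main/resources/application.properties.
    System.out.println(readApiKey("google.api.key=YOUR_API_KEY\n"));
  }
}
```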

Our service class then looks like:

//BookService.java
package info.sanaulla.service;

import com.google.common.base.Optional;

import info.sanaulla.Constants;
import info.sanaulla.Util;
import info.sanaulla.model.Author;
import info.sanaulla.model.Book;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class BookService {

  public static Optional<Book> getBookDetailsFromGoogleBooks(String isbn13) throws IOException{
    Properties properties = Util.getProperties();
    String key = properties.getProperty(Constants.GOOGLE_API_KEY);
    // Pass the API key along with the query, as in the sample request above.
    String url = "https://www.googleapis.com/books/v1/volumes?q=isbn:"+isbn13+"&key="+key;
    String response = Util.getHttpResponse(url);
    Map bookMap = Util.getObjectMapper().readValue(response,Map.class);
    Object bookDataListObj = bookMap.get("items");
    Book book = null;
    if ( bookDataListObj == null || !(bookDataListObj instanceof List)){
      return Optional.fromNullable(book);
    }

    List bookDataList = (List)bookDataListObj;
    if ( bookDataList.size() < 1){
      return Optional.fromNullable(null);
    }

    Map bookData = (Map) bookDataList.get(0);
    Map volumeInfo = (Map)bookData.get("volumeInfo");
    book = new Book();
    book.setTitle(getFromJsonResponse(volumeInfo,"title",""));
    book.setPublisher(getFromJsonResponse(volumeInfo,"publisher",""));
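    Object authorDataListObj = volumeInfo.get("authors");
    // Guard against volumes whose response has no "authors" field.
    if ( authorDataListObj instanceof List ){
      for(Object authorDataObj : (List)authorDataListObj){
        Author author = new Author();
        author.setName(authorDataObj.toString());
        book.addAuthor(author);
      }
    }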
    book.setIsbn13(isbn13);
    book.setSummary(getFromJsonResponse(volumeInfo,"description",""));
    book.setPageCount(Integer.parseInt(getFromJsonResponse(volumeInfo, "pageCount", "0")));
    book.setPublishedDate(getFromJsonResponse(volumeInfo,"publishedDate",""));

    return Optional.fromNullable(book);
  }

  private static String getFromJsonResponse(Map jsonData, String key, String defaultValue){
    return Optional.fromNullable(jsonData.get(key)).or(defaultValue).toString();
  }
}
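The extraction step can be tried without Jackson or a network call. The sketch below builds a small map by hand that mimics the parsed "volumeInfo" object and reads fields from it with defaults; the class and method names are made up for this illustration, and the sample values are taken from the program output shown later in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VolumeInfoExtractDemo {

  // Returns the field as a string, or defaultValue when the field is absent.
  static String field(Map<String, Object> volumeInfo, String key, String defaultValue) {
    Object value = volumeInfo.get(key);
    return value == null ? defaultValue : value.toString();
  }

  // Returns the author names, or an empty list when "authors" is missing.
  static List<String> authors(Map<String, Object> volumeInfo) {
    List<String> names = new ArrayList<String>();
    Object raw = volumeInfo.get("authors");
    if (raw instanceof List) {
      for (Object name : (List<?>) raw) {
        names.add(String.valueOf(name));
      }
    }
    return names;
  }

  // Hand-built stand-in for the map Jackson would produce for "volumeInfo".
  static Map<String, Object> sampleVolumeInfo() {
    Map<String, Object> volumeInfo = new HashMap<String, Object>();
    volumeInfo.put("title", "Head First Java");
    volumeInfo.put("authors", Arrays.asList("Kathy Sierra", "Bert Bates"));
    volumeInfo.put("pageCount", 688);
    return volumeInfo;
  }

  public static void main(String[] args) {
    Map<String, Object> info = sampleVolumeInfo();
    System.out.println(field(info, "title", ""));
    System.out.println(authors(info));
    System.out.println(Integer.parseInt(field(info, "pageCount", "0")));
    // "publisher" is deliberately missing from the sample, so the default is used.
    System.out.println(field(info, "publisher", "(unknown)"));
  }
}
```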

Adding caching on top of the Google Books API call

We can create a cache object using the CacheBuilder API provided by the Guava library. It provides methods to set properties such as:

  • the maximum number of items in the cache,
  • the time to live of a cached object, based on its last write time or last access time,
  • the interval after which a cached object is refreshed,
  • recording of cache stats such as hits, misses, and loading time, and
  • the loader code that fetches the data on a cache miss or cache refresh.

What we want is for a cache miss to invoke the method written above, getBookDetailsFromGoogleBooks. We also want to store a maximum of 1000 items and expire an item 24 hours after it was last accessed. The piece of code which builds the cache looks like:

private static LoadingCache<String, Optional<Book>> cache = CacheBuilder.newBuilder()
  .maximumSize(1000)
  .expireAfterAccess(24, TimeUnit.HOURS)
  .recordStats()
  .build(new CacheLoader<String, Optional<Book>>() {
      @Override
      public Optional<Book> load(String s) throws IOException {
          return getBookDetailsFromGoogleBooks(s);
      }
  });
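The App class further down calls BookService.getBookDetails and BookService.getCacheStats, which are not listed in this post. A plausible shape for them, assuming they simply delegate to the cache, is sketched here. To keep the example runnable offline, the value type is Optional<String> and slowLookup is a stand-in for getBookDetailsFromGoogleBooks; both are assumptions, as is the helper lookupFiveTimes.

```java
import com.google.common.base.Optional;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.CacheStats;
import com.google.common.cache.LoadingCache;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

public class CachedLookupSketch {

  // Stand-in for getBookDetailsFromGoogleBooks, so that no network call is needed.
  static Optional<String> slowLookup(String isbn13) {
    return Optional.of("details for " + isbn13);
  }

  private static final LoadingCache<String, Optional<String>> cache = CacheBuilder.newBuilder()
      .maximumSize(1000)
      .expireAfterAccess(24, TimeUnit.HOURS)
      .recordStats()
      .build(new CacheLoader<String, Optional<String>>() {
        @Override
        public Optional<String> load(String isbn13) {
          return slowLookup(isbn13);
        }
      });

  // Assumed shape of BookService.getBookDetails: every lookup goes through the cache.
  public static Optional<String> getBookDetails(String isbn13) throws ExecutionException {
    return cache.get(isbn13);
  }

  // Assumed shape of BookService.getCacheStats: expose the statistics recorded so far.
  public static CacheStats getCacheStats() {
    return cache.stats();
  }

  // Looks up one ISBN five times and returns {hits, misses} caused by those lookups.
  static long[] lookupFiveTimes(String isbn13) {
    CacheStats before = getCacheStats();
    try {
      for (int i = 0; i < 5; i++) {
        getBookDetails(isbn13);
      }
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    }
    CacheStats delta = getCacheStats().minus(before);
    return new long[] {delta.hitCount(), delta.missCount()};
  }

  public static void main(String[] args) {
    long[] counts = lookupFiveTimes("9780596009205");
    System.out.println("hits=" + counts[0] + " misses=" + counts[1]);
  }
}
```

Wrapping the value in Optional matters here because a CacheLoader must not return null; an absent book is cached as an empty Optional instead.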

It's important to note that the maximum number of items you store in the cache affects the heap used by your application. Choose this value carefully, based on the size of each object you are going to cache and the maximum heap memory allocated to your application.

Let's put this into action and see what the cache stats report:

package info.sanaulla;

import com.google.common.cache.CacheStats;
import info.sanaulla.model.Book;
import info.sanaulla.service.BookService;

import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class App 
{
  public static void main( String[] args ) throws IOException, ExecutionException {
    Book book = BookService.getBookDetails("9780596009205").get();
    System.out.println(Util.getObjectMapper().writeValueAsString(book));
    book = BookService.getBookDetails("9780596009205").get();
    book = BookService.getBookDetails("9780596009205").get();
    book = BookService.getBookDetails("9780596009205").get();
    book = BookService.getBookDetails("9780596009205").get();
    CacheStats cacheStats = BookService.getCacheStats();
    System.out.println(cacheStats.toString());
  }
}

And the output we would get is:

{"isbn13":"9780596009205","authors":[{"name":"Kathy Sierra"},{"name":"Bert Bates"}],"publisher":""O'Reilly Media, Inc."","title":"Head First Java","summary":"An interactive guide to the fundamentals of the Java programming language utilizes icons, cartoons, and numerous other visual aids to introduce the features and functions of Java and to teach the principles of designing and writing Java programs.","pageCount":688,"publishedDate":"2005-02-09"}
CacheStats{hitCount=4, missCount=1, loadSuccessCount=1, loadExceptionCount=0, totalLoadTime=3744128770, evictionCount=0}

This is a very basic usage of Guava cache, which I wrote while I was learning to use it. I have also made use of other Guava APIs such as Optional, which wraps a value that may or may not exist (null) in an object. The code is available on GitHub: https://github.com/sanaulla123/Guava-Cache-Demo. There are concerns, such as how the cache handles concurrency, which I haven't gone into in detail. Under the hood it uses a segmented concurrent hash map, so gets of already cached entries do not block, while the number of concurrent writes is determined by the number of segments.
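One concurrency property that is easy to check is that concurrent gets for the same missing key share a single load. The sketch below is only an illustration: the class name, the thread count, the 100 ms sleep that simulates a slow call, and the concurrencyLevel value are all assumptions. concurrencyLevel is the CacheBuilder setting that hints at the number of internal segments.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentLoadDemo {

  // Starts `threads` concurrent gets for one key and returns how often the loader ran.
  static int loadsForConcurrentGets(int threads) {
    final AtomicInteger loads = new AtomicInteger();
    final LoadingCache<String, String> cache = CacheBuilder.newBuilder()
        .concurrencyLevel(4) // hint for the number of internal segments
        .build(new CacheLoader<String, String>() {
          @Override
          public String load(String key) {
            loads.incrementAndGet();
            try {
              Thread.sleep(100); // simulate a slow network call
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
            }
            return "value-for-" + key;
          }
        });

    final CountDownLatch start = new CountDownLatch(1);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < threads; i++) {
      pool.execute(new Runnable() {
        @Override
        public void run() {
          try {
            start.await(); // release all threads at the same moment
            cache.getUnchecked("9780596009205");
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        }
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return loads.get();
  }

  public static void main(String[] args) {
    System.out.println("loader invocations: " + loadsForConcurrentGets(8));
  }
}
```

Threads that ask for a key while it is being loaded wait for that load to finish instead of starting their own, which is what keeps a burst of requests for one ISBN from turning into a burst of API calls.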

A useful link related to this:
http://guava-libraries.googlecode.com/files/ConcurrentCachingAtGoogle.pdf
