Eclipse - Run Code Clean Up Manually + Save Action

The Scenario
Eclipse shows compiler warnings in the Problems view.
Some are trivial, such as an unused import, while others are more serious, such as a potential null access.

But if we don't fix the trivial issues, the project may accumulate too many warnings. That can lead us to ignore all of them, including the vital ones that point to potential bugs.

So usually I don't like to see any compiler warnings, either in the current editor or in the whole project - we can check them in the Problems view.

How Eclipse can Help
First, we can configure the compiler at Preferences -> Java -> Compiler -> Errors/Warnings.

Save Actions
We can configure Eclipse Save Actions at Java -> Editor -> Save Actions to automatically format code, organize imports, and more.
- We can also configure save actions for JavaScript, Scala, and other languages.

But sometimes, when we only modify a few lines of a file, we don't want to change other parts; otherwise, when others review the change, it's difficult for them to figure out what actually changed.

So usually I only configure Save Actions to format edited lines and organize imports.

We can also configure General -> Editors -> AutoSave to save dirty editors every X seconds.

Run Code Clean Up Manually
First we assign a shortcut such as Ctrl+Alt+Command+C in Preferences (Command+,) -> Keys.
- We can also configure this for JavaScript.

Then we configure what Code Clean Up does at Preferences -> Java -> Code Style -> Clean Up.

It can do more than 20 things, such as format code, organize imports, add @Override or final, add a serial version ID, add unimplemented methods, remove trailing whitespace, correct indentation, and much more.

If I change most of the current file, or I think it's necessary, I press Ctrl+Alt+Command+C to run Code Clean Up manually.

Caching Data in Spring Using Redis

The Scenario
We would like to cache Cassandra data in Redis for better read performance.

Cache Configuration
To make the data in Redis more readable and easier to troubleshoot and debug, we use GenericJackson2JsonRedisSerializer to serialize values as JSON, and StringRedisSerializer to serialize keys.

To make GenericJackson2JsonRedisSerializer work, we also configure the objectMapper to store type info: objectMapper.enableDefaultTyping(ObjectMapper.DefaultTyping.NON_FINAL, JsonTypeInfo.As.PROPERTY);
- We need to store the class info so the value can be deserialized back to the right type.
- The data then looks like: {"@class":"com....Configuration","name":"configA","value":"805"}

We also configure the objectMapper with configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false).
- When there is no matching data in Cassandra, we still want to cache that fact in Redis, so later reads can return from the Redis cache without hitting the database again.
- Spring cache stores org.springframework.cache.support.NullValue in this case, whose JSON form is {}. We need to configure the ObjectMapper to serialize an object with no properties as an empty object. By default it throws:
org.springframework.data.redis.serializer.SerializationException: Could not write JSON: No serializer found for class org.springframework.cache.support.NullValue and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS); nested exception is com.fasterxml.jackson.databind.JsonMappingException: No serializer found for class org.springframework.cache.support.NullValue and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS)

We configure the cacheManager to store null values.
- We use a configuration-driven approach and have a lot of configurations. We define default configuration values in property files; the code first reads from the database and, if the result is null, falls back to the property files. This is why we want to cache null values.
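The read-then-fallback path can be sketched in plain Java. Note that dbValues and defaults below are hypothetical stand-ins for the Cassandra table and the property files:

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigLookupDemo {
    // Hypothetical stand-ins: dbValues models Cassandra, defaults models the property files.
    static Map<String, String> dbValues = new HashMap<>();
    static Map<String, String> defaults = Map.of("configA", "805");

    // The return value of this method is what @Cacheable would store - including null,
    // which is exactly why we want the cache to remember null results.
    static String findConfig(String name) {
        String value = dbValues.get(name);                   // 1. try the database
        return value != null ? value : defaults.get(name);   // 2. fall back to property files
    }

    public static void main(String[] args) {
        System.out.println(findConfig("configA")); // 805 (db empty, falls back to defaults)
        dbValues.put("configA", "900");
        System.out.println(findConfig("configA")); // 900 (db value wins)
    }
}
```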

We also use SpEL to set a different TTL for each cache:
redis.expires={configData:XSeconds, userSession: YSeconds}

@Configuration
@EnableCaching
public class RedisCacheConfig extends CachingConfigurerSupport {
    @Value("${redis.hostname}")
    String redisHostname;
    @Value("${redis.port}")
    int redisPort;
    @Value("#{${redis.expires}}")
    private Map<String, Long> expires;
    @Bean
    public JedisConnectionFactory redisConnectionFactory() {
        final JedisConnectionFactory redisConnectionFactory = new JedisConnectionFactory();
        redisConnectionFactory.setHostName(redisHostname);
        redisConnectionFactory.setPort(redisPort);
        redisConnectionFactory.setUsePool(true);
        return redisConnectionFactory;
    }
    @Bean("redisTemplate")
    public RedisTemplate<String, Object> genericJacksonRedisTemplate(final JedisConnectionFactory cf) {
        final RedisTemplate<String, Object> redisTemplate = new RedisTemplate<>();
        redisTemplate.setKeySerializer(new StringRedisSerializer());
        redisTemplate.setHashKeySerializer(new StringRedisSerializer());
        redisTemplate.setValueSerializer(new GenericJackson2JsonRedisSerializer(createRedisObjectmapper()));
        redisTemplate.setHashValueSerializer(new GenericJackson2JsonRedisSerializer(createRedisObjectmapper()));
        redisTemplate.setConnectionFactory(cf);
        return redisTemplate;
    }
    @Bean
    public CacheManager cacheManager(final RedisTemplate<String, Object> redisTemplate) {
        final RedisCacheManager cacheManager =
                new RedisCacheManager(redisTemplate, Collections.<String>emptyList(), true);
        cacheManager.setDefaultExpiration(86400);
        cacheManager.setExpires(expires);
        cacheManager.setLoadRemoteCachesOnStartup(true);
        return cacheManager;
    }

    public static ObjectMapper createRedisObjectmapper() {
        final SimpleDateFormat sdf = new SimpleDateFormat(DEFAULT_DATE_FORMAT, Locale.ROOT);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        final SimpleModule dateModule = (new SimpleModule()).addDeserializer(Date.class, new JsonDateDeserializer());
        return new ObjectMapper()
                .enableDefaultTyping(ObjectMapper.DefaultTyping.NON_FINAL,JsonTypeInfo.As.PROPERTY)//\\
                .registerModule(dateModule).setDateFormat(sdf)
                .configure(DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY, true)
                .configure(DeserializationFeature.UNWRAP_SINGLE_VALUE_ARRAYS, true)
                .configure(DeserializationFeature.FAIL_ON_IGNORED_PROPERTIES, false)
                .configure(DeserializationFeature.FAIL_ON_INVALID_SUBTYPE, false)
                .configure(DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES, false)
                .configure(DeserializationFeature.FAIL_ON_NUMBERS_FOR_ENUMS, false)
                .configure(DeserializationFeature.FAIL_ON_READING_DUP_TREE_KEY, false)
                .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
                .configure(DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS, false)
                .configure(SerializationFeature.FAIL_ON_EMPTY_BEANS, false) //\\
                .setSerializationInclusion(JsonInclude.Include.NON_NULL)
                .setVisibility(PropertyAccessor.ALL, JsonAutoDetect.Visibility.ANY);
    }
}
Cache CassandraRepository
@Repository
@CacheConfig(cacheNames = Util.CACHE_CONFIG)
public interface ConfigurationDao extends CassandraRepository<Configuration> {
    @Query("Select * from configuration where name=?0")
    @Cacheable
    Configuration findByName(String name);

    @Query("Delete from configuration where name=?0")
    @CacheEvict
    void delete(String name);

    @Override
    @CacheEvict(key = "#p0.name")
    void delete(Configuration config);

    /*
     * Check https://docs.spring.io/spring/docs/current/spring-framework-reference/html/cache.html
     * about what #p0 means
     */
    @Override
    @SuppressWarnings("unchecked")
    @CachePut(key = "#p0.name")
    Configuration save(Configuration config);

    /*
     * This API doesn't work very well with cache - as spring cache doesn't support put or evict
     * multiple keys. Call save(Configuration config) in a loop instead.
     */
    @Override
    @CacheEvict(allEntries = true)
    @Deprecated
    <S extends Configuration> Iterable<S> save(Iterable<S> configs);

    /*
     * This API doesn't work very well with cache - as spring cache doesn't support put or evict
     * multiple keys. Call delete(Configuration config) in a loop instead.
     */
    @Override
    @CacheEvict(allEntries = true)
    @Deprecated
    void delete(Iterable<? extends Configuration> configs);
}
Admin API to Manage Cache 
We inject CacheManager to add or evict data from Redis.
But to scan all keys of a cache (for example, config), I need to use stringRedisTemplate.opsForZSet() to read the cache's key set:
- the cache maintains a <cacheName>~keys zset whose members are plain string keys, so I need StringRedisTemplate to read it.

After getting the keys, I use redisTemplate.opsForValue().multiGet to get their values.

- I will update this post if I find a better way to do this.

public class CacheResource {
    private static final String REDIS_CACHE_SUFFIX_KEYS = "~keys";
    @Autowired
    @Qualifier("redisTemplate")
    RedisTemplate<String, Object> redisTemplate;

    @Autowired
    @Qualifier("stringRedisTemplate")
    StringRedisTemplate stringRedisTemplate;

    @Autowired
    private CacheManager cacheManager;

    /**
     * If sessionId is not null, return its associated user info.<br>
     * It also returns other cached data - these data sets are small.
     *
     * @return a map of cache names to their cached values
     */
    @GetMapping(produces = MediaType.APPLICATION_JSON_VALUE, path = "/cache")
    public Map<String, Object> get(@RequestParam("sessionIds") final String sessionIds,
            @RequestParam(name = "getConfig", defaultValue = "false") final boolean getConfig) {
        final Map<String, Object> resultMap = new HashMap<>();
        if (getConfig) {
            final Set<String> configKeys =
                    stringRedisTemplate.opsForZSet().range(Util.CACHE_CONFIG_DAO + REDIS_CACHE_SUFFIX_KEYS, 0, -1);
            final List<Object> objects = redisTemplate.opsForValue().multiGet(configKeys);
            resultMap.put(Util.CACHE_CONFIG + REDIS_CACHE_SUFFIX_KEYS, objects);
        }
        if (StringUtils.isNotBlank(sessionIds)) {
            final Map<String, Object> sessionIdToUsers = new HashMap<>();
            final Long totalUserCount = stringRedisTemplate.opsForZSet().size(Util.CACHE_USER + REDIS_CACHE_SUFFIX_KEYS);
            sessionIdToUsers.put("totalUserCount", totalUserCount);
            final ArrayList<String> sessionIdList = Lists.newArrayList(Util.COMMA_SPLITTER.split(sessionIds));
            final List<Object> sessionIDValues = redisTemplate.opsForValue().multiGet(sessionIdList);
            for (int i = 0; i < sessionIdList.size(); i++) {
                sessionIdToUsers.put(sessionIdList.get(i), sessionIDValues.get(i));
            }
            resultMap.put(Util.CACHE_USER + REDIS_CACHE_SUFFIX_KEYS, sessionIdToUsers);
        }
        return resultMap;
    }

    @DeleteMapping("/cache")
    public void clear(@RequestParam("removeSessionIds") final String removeSessionIds,
            @RequestParam(name = "clearSessions", defaultValue = "false") final boolean clearSessions,
            @RequestParam(name = "clearConfig", defaultValue = "false") final boolean clearConfig) {
        if (clearConfig) {
            final Cache configCache = getConfigCache();
            configCache.clear();
        }
        final Cache userCache = getUserCache();
        if (clearSessions) {
            userCache.clear();
        } else if (StringUtils.isNotBlank(removeSessionIds)) {
            final ArrayList<String> sessionIdList = Lists.newArrayList(Util.COMMA_SPLITTER.split(removeSessionIds));
            for (final String sessionId : sessionIdList) {
                userCache.evict(sessionId);
            }
        }
    }

    /**
     * Only handles client session data - for other caches, such as configuration, we can use
     * server-side APIs to update them
     */
    @PutMapping("/cache")
    public void addOrUpdate(...) {
        if (newUserSessions == null) {
            return;
        }
        final Cache userCache = getUserCache();
        // userCache.put to add key, value
    }

    private Cache getConfigCache() {
        return cacheManager.getCache(Util.CACHE_CONFIG_DAO);
    }

    private Cache getUserCache() {
        return cacheManager.getCache(Util.CACHE_USER);
    }
}
StringRedisTemplate
@Bean("stringRedisTemplate")
public StringRedisTemplate stringRedisTemplate(final JedisConnectionFactory cf) {
    final StringRedisTemplate redisTemplate = new StringRedisTemplate();
    redisTemplate.setConnectionFactory(cf);
    return redisTemplate;
}

Misc
- If you want to disable caching in some environments, use NoOpCacheManager.
- When debugging, check this code:
CacheAspectSupport.execute
SpringCacheAnnotationParser.parseCacheAnnotations

Spring @Cacheable Not Working - How to Troubleshoot and Solve It

The Scenario
Spring's cache abstraction annotations make it easy to add caching to an application: just define a CacheManager in the configuration, then use the annotations @Cacheable, @CachePut, and @CacheEvict to use and maintain the cache.

But what do we do if the cache annotations don't seem to work?

How do we know the cache isn't working?
Logging
We can change the log level to print database queries in the log. For Cassandra, we can set the log level of com.datastax.driver.core.RequestHandler to TRACE.
Debug
We can set a breakpoint at CacheAspectSupport.execute.
If the cache works, calling a method annotated with a cache annotation will not invoke the method directly; instead, the call will be intercepted and hit the breakpoint.
Test
We can also write a test that calls the cached method twice and verifies the underlying method (or the database query) executes only once.

Possible Root Causes
1. The class using cache annotations is initialized too early
This usually happens when we use cache-annotated classes in a configuration or AOP class.

Spring first creates the configuration and AOP classes, which causes the beans of cache-annotated classes to be created before the cache configuration is fully set up. As a result, these beans are created without the @Cacheable handling applied.

Please check "Bean X of type Y is not eligible for getting processed by all BeanPostProcessors" for a detailed explanation.

How to troubleshoot
Add a breakpoint in the bean's default constructor (add one if it doesn't exist); from the stack trace we can figure out why, and which bean (or configuration class) causes this bean to be created early.

Use @Lazy, ObjectFactory, or other approaches to break the eager dependency, then restart and check again until the cached method works as expected.

Also check whether there is a log entry like: "Bean X of type Y is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)".

2. Calling the cached method from the same class
Solutions:
- Self-inject the bean (or use applicationContext.getBean), then call the cached method through that bean reference
- Use @Scope(proxyMode = ScopedProxyMode.TARGET_CLASS)
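The root cause is that Spring applies @Cacheable through a proxy, and a call made from inside the same object never goes through that proxy. A stdlib-only sketch using a JDK dynamic proxy (all names hypothetical) demonstrates the effect:

```java
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Service {
        String cached();
        String caller();
    }

    static class ServiceImpl implements Service {
        public String cached() { return "computed"; }
        // Self-invocation: this call never reaches the proxy, just like
        // calling a @Cacheable method from another method of the same bean.
        public String caller() { return cached(); }
    }

    // Wrap the target the way a caching proxy would: intercept cached() only.
    static Service proxy(ServiceImpl target) {
        return (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(), new Class<?>[] {Service.class},
                (p, method, args) -> "cached".equals(method.getName())
                        ? "from-cache"                 // simulated cache hit
                        : method.invoke(target, args));
    }

    public static void main(String[] args) {
        Service service = proxy(new ServiceImpl());
        System.out.println(service.cached()); // from-cache: external call is intercepted
        System.out.println(service.caller()); // computed: internal call bypassed the proxy
    }
}
```

Self-injection fixes this because the internal call then goes through the proxied bean reference instead of `this`.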

Script to Setup SolrCloud Environment

Scenario
Here is a script that creates a SolrCloud environment in a local dev setup: a collection with numShards=2&replicationFactor=3 on 3 separate (local) nodes.

solr_init creates the folders example/cloud/node{1,2,3}/solr, copies solr.xml and zoo.cfg into them, starts the servers, and creates the collection using the admin collections API.

The other scripts, which start/stop the nodes, are easier to implement.
The implementation
function solr_init()
{
  cd $SOLR_HOME
  mkdir -p $SOLR_NODE1_REL_HOME
  mkdir -p $SOLR_NODE2_REL_HOME
  mkdir -p $SOLR_NODE3_REL_HOME

  cp $SOLR_HOME/server/solr/solr.xml $SOLR_HOME/server/solr/zoo.cfg $SOLR_NODE1_REL_HOME
  cp $SOLR_HOME/server/solr/solr.xml $SOLR_HOME/server/solr/zoo.cfg $SOLR_NODE2_REL_HOME
  cp $SOLR_HOME/server/solr/solr.xml $SOLR_HOME/server/solr/zoo.cfg $SOLR_NODE3_REL_HOME

  solr_start
  data_solr_create
}

function solr_start() {
  if [[ `solr_pid` ]]
  then
    echo "solr is already running...";
  else
    echo "Starting solr-cloud... $SOLR_NODE1_PORT, $SOLR_NODE2_PORT, $SOLR_NODE3_PORT";
    $SOLR_HOME/bin/solr start -cloud -Dsolr.ltr.enabled=true -s "$SOLR_NODE1_REL_HOME" -p $SOLR_NODE1_PORT -h $SOLR_HOSTNAME;
    $SOLR_HOME/bin/solr start -cloud -Dsolr.ltr.enabled=true -s "$SOLR_NODE2_REL_HOME" -p $SOLR_NODE2_PORT -z $SOLR_ZKHOST -h $SOLR_HOSTNAME;
    $SOLR_HOME/bin/solr start -cloud -Dsolr.ltr.enabled=true -s "$SOLR_NODE3_REL_HOME" -p $SOLR_NODE3_PORT -z $SOLR_ZKHOST -h $SOLR_HOSTNAME;
  fi
}

function solr_stop() {
  echo "Stopping solr-cloud...";
  $SOLR_HOME/bin/solr stop -all;
}

function solr_restart() {
  echo "Restarting solr-cloud...";
  solr_stop && solr_start
}

function solr_pid() {
  pgrep -f "solr-6.4.0/server";
}

function data_solr_create() {
  # Go to the solr config directory
  currdir=`pwd`;
  cd "$WS/resource/solr";

  # Retrieve list of collections
  collections_list=`curl -s -v -X GET  -H 'Content-type:application/json' "$SOLR_NODE1_ENDPOINT/admin/collections?action=LIST&wt=json" | jq '.collections | join(" ")' `;

  # Create/update schema
  mv solrconfig solrconfig.old.`datetimestamp`;
  unzip -d solrconfig solr-core-config.zip;

  # create myCollection
  cp myCollection_solrconfig.xml solrconfig/conf/solrconfig.xml;
  cp myCollection_schema.xml solrconfig/conf/schema.xml;
  $SOLR_HOME/server/scripts/cloud-scripts/zkcli.sh -zkhost "$SOLR_ZKHOST" -cmd upconfig -confname myCollection -confdir "solrconfig/conf/";
  if grep -q "myCollection" <<< $collections_list; then
    curl -s -v -X GET "$SOLR_NODE1_ENDPOINT/admin/collections?action=RELOAD&name=myCollection";
    echo "Updated myCollection";
  else
    curl -s -v -X GET "$SOLR_NODE1_ENDPOINT/admin/collections?action=CREATE&name=myCollection&numShards=2&collection.configName=myCollection&replicationFactor=3&maxShardsPerNode=2";
    echo "Created myCollection";
  fi

  rm -rf solrconfig;
  cd $currdir;
}
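The create-or-reload decision above hinges on a grep -q membership test against the jq output; a tiny standalone sketch of that pattern (the collection names are made up):

```shell
#!/usr/bin/env bash
# jq's join(" ") leaves the result quoted, e.g.: "myCollection otherCollection"
collections_list='"myCollection otherCollection"'

if grep -q "myCollection" <<< "$collections_list"; then
  echo "exists - would RELOAD"
else
  echo "missing - would CREATE"
fi
```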

Java APIs to Get and Update Solr Configuration Files

Use Case
SolrCloud stores its configuration files (for example, elevate.xml) in ZooKeeper. Usually we need APIs that clients (for example, a UI) can call to get or update these configuration files.

Related: Build Web Service APIs to Update Solr's Managed Resources (stop words, synonyms)

The Implementation
We use SolrJ's SolrZkClient APIs to get data, make a znode, and set a znode's data.

public String getConfigData(final String filePath) {
    final ZkStateReader zkStateReader = getZKReader(getSolrClient());
    final String path = normalizeConfigPath(filePath);
    final SolrZkClient zkClient = zkStateReader.getZkClient();
    try {
        return new String(zkClient.getData(path, null, null, true));
    } catch (KeeperException | InterruptedException e) {
        throw new BusinessException(ErrorCode.data_access_error, e, "Failed to get " + path);
    }
}
public void setConfigData(final String filePath, final String data, final boolean createPath,
        final boolean reloadCollection) {
    Validate.notNull(filePath);
    Validate.notNull(data);
    final ZkStateReader zkStateReader = getZKReader(getSolrClient());
    final String path = normalizeConfigPath(filePath);
    final SolrZkClient zkClient = zkStateReader.getZkClient();
    try {
        if (createPath) {
            zkClient.makePath(path, false, true);
        }
        zkClient.setData(path, data.getBytes(), true);

        if (reloadCollection) {
            reloadCollection();
        }
    } catch (KeeperException | InterruptedException e) {
        throw new BusinessException(ErrorCode.data_access_error, e, "Failed to set " + path);
    }
}

public void reloadCollection() {
    try {
        final CollectionAdminRequest.Reload reload = new CollectionAdminRequest.Reload();
        reload.setCollectionName(getSolrClient().getDefaultCollection());
        final CollectionAdminResponse response = reload.process(getSolrClient());
        logger.info(MessageFormat.format("reload collection: {0} rsp: {1}", getSolrClient().getDefaultCollection(),
                response));
        final int status = response.getStatus();
        if (status != 0) {
            throw new BusinessException(ErrorCode.data_access_error,
                    "Failed to reload collection, status: " + status);
        }
    } catch (SolrServerException | IOException e) {
        throw new BusinessException(ErrorCode.data_access_error,
                "Failed to reload collection: " + getSolrClient().getDefaultCollection());
    }
}

public static ZkStateReader getZKReader(final CloudSolrClient solrClient) {
    final ZkStateReader zkReader = solrClient.getZkStateReader();
    if (zkReader == null) {
        // This only happens the first time we call solrClient to do anything.
        // Usually we call solrClient during application startup (for example, a health
        // check), so in most cases it's already connected.
        solrClient.connect();
    }
    return solrClient.getZkStateReader();
}
/**
 * @param filePath relative config file path, such as elevate.xml
 * @return the path with prefix added - for example, elevate.xml becomes /configs/myCollection/elevate.xml
 */
private String normalizeConfigPath(final String filePath) {
    return ZkStateReader.CONFIGS_ZKNODE + "/" + getSolrClient().getDefaultCollection() + "/" + filePath;
}
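As a sanity check, the prefixing is plain string concatenation. A standalone sketch (ZkStateReader.CONFIGS_ZKNODE is inlined as "/configs" so it runs without SolrJ on the classpath, and the collection name is hard-coded):

```java
public class NormalizePathDemo {
    // "/configs" mirrors ZkStateReader.CONFIGS_ZKNODE; inlined here so the
    // sketch runs without SolrJ on the classpath.
    static String normalizeConfigPath(String collection, String filePath) {
        return "/configs" + "/" + collection + "/" + filePath;
    }

    public static void main(String[] args) {
        System.out.println(normalizeConfigPath("myCollection", "elevate.xml"));
        // → /configs/myCollection/elevate.xml
    }
}
```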

Resources
Build Web Service APIs to Update Solr's Managed Resources (stop words, synonyms)

Java APIs to Build Solr Suggester and Get Suggestion

Use Case
Usually we provide REST APIs to manage Solr; the same applies to the suggester.
This article focuses on how to programmatically build a Solr suggester and get suggestions using Java code.

The implementation
Please check the end of the article for Solr configuration files.

Build Suggester
In Solr, after we add docs, we call suggest?suggest.build=true to build the suggester and make the new docs available for autocompletion.

The only trick here is that the suggest.build request doesn't build the suggester for all cores in the collection, but only for the core that receives the request.

We need to get all replica URLs of the collection, add them to the shards parameter, and also add shards.qt=/suggest:
shards=127.0.0.1:4567/solr/myCollection_shard1_replica3,127.0.0.1:4565/solr/myCollection_shard1_replica2,127.0.0.1:4566/solr/myCollection_shard1_replica1,127.0.0.1:4567/solr/myCollection_shard2_replica3,127.0.0.1:4566/solr/myCollection_shard2_replica1/,127.0.0.1:4565/solr/myCollection_shard2_replica2&shards.qt=/suggest

public void buildSuggester() {
    final SolrQuery solrQuery = new SolrQuery();
    final List<String> urls = getAllSolrCoreUrls(getSolrClient());

    solrQuery.setRequestHandler("/suggest").setParam("suggest.build", "true")
            .setParam(ShardParams.SHARDS, COMMA_JOINER.join(urls))
            .setParam(ShardParams.SHARDS_QT, "/suggest");
    try {
        final QueryResponse queryResponse = getSolrClient().query(solrQuery);
        final int status = queryResponse.getStatus();
        if (status >= 300) {
            throw new BusinessException(ErrorCode.data_access_error,
                    MessageFormat.format("Failed to build suggestions: status: {0}", status));
        }
    } catch (SolrServerException | IOException e) {
        throw new BusinessException(ErrorCode.data_access_error, e, "Failed to build suggestions");
    }
}
public static List<String> getAllSolrCoreUrls(final CloudSolrClient solrClient) {
    final ZkStateReader zkReader = getZKReader(solrClient);
    final ClusterState clusterState = zkReader.getClusterState();

    final Collection<Slice> slices = clusterState.getSlices(solrClient.getDefaultCollection());
    if (slices.isEmpty()) {
        throw new BusinessException(ErrorCode.data_access_error, "No slices");
    }
    return slices.stream().map(slice -> slice.getReplicas()).flatMap(replicas -> replicas.stream())
            .map(replica -> replica.getCoreUrl()).collect(Collectors.toList());
}

private static ZkStateReader getZKReader(final CloudSolrClient solrClient) {
    final ZkStateReader zkReader = solrClient.getZkStateReader();
    if (zkReader == null) {
        // This only happens the first time we call solrClient to do anything.
        // Usually we call solrClient during application startup (for example, a health
        // check), so in most cases it's already connected.
        solrClient.connect();
    }
    return solrClient.getZkStateReader();
}

Get Suggestions


public Set<SearchSuggestion> getSuggestions(final String prefix, final int limit) {
   final Set<SearchSuggestion> result = new LinkedHashSet<>(limit);
   try {
       final SolrQuery solrQuery = new SolrQuery().setRequestHandler("/suggest").setParam("suggest.q", prefix)
               .setParam("suggest.count", String.valueOf(limit)).setParam(CommonParams.TIME_ALLOWED,
                       mergedConfig.getConfigByNameAsString("search.suggestions.time_allowed.millSeconds"));
       // context filters
       solrQuery.setParam("suggest.cfq", getContextFilters());
       final QueryResponse queryResponse = getSolrClient().query(solrQuery);
       if (queryResponse != null) {
           final SuggesterResponse suggesterResponse = queryResponse.getSuggesterResponse();
           final Map<String, List<Suggestion>> map = suggesterResponse.getSuggestions();
           final List<Suggestion> infixSuggesters = map.get("infixSuggester");
           if (infixSuggesters != null) {
               for (final Suggestion suggester : infixSuggesters) {
                   if (result.size() < limit) {
                       result.add(new SearchSuggestion().setText(suggester.getTerm())
                               .setHighlightedText(replaceTagB(suggester.getTerm())));
                   } else {
                       break;
                   }
               }
           }
       }
       logger.info(
               MessageFormat.format("prefix: {0}, limit: {1}, result: {2}", prefix, limit, result));
       return result;
   } catch (final Exception e) {
       throw new BusinessException(ErrorCode.data_access_error, e, "Failed to get suggestions for " + prefix);
   }
}
private static final Pattern TAGB_PATTERN = Pattern.compile("<b>|</b>");
public static String replaceTagB(String input)
{
    return TAGB_PATTERN.matcher(input).replaceAll("");
}

Schema.xml
We define the textSuggest type and the suggesterContextField field, copy the fields that are shown in autocompletion to the suggester field, and copy filter fields such as zipCodes and genres to suggesterContextField.

Solr's suggester supports filtering on multiple fields; we just need to copy all the filter fields to suggesterContextField.


<field name="suggester" type="textSuggest" indexed="true"
  stored="true" multiValued="true" />
<field name="suggesterContextField" type="string" indexed="true" stored="true"
  multiValued="true" />

<copyField source="seriesTitle" dest="suggester" />
<copyField source="programTitle" dest="suggester" />

<copyField source="zipCodes" dest="suggesterContextField" />
<copyField source="genres" dest="suggesterContextField" />
SolrConfig.xml
We can add multiple suggester implementations to the searchComponent. Another very useful one is FileDictionaryFactory, which allows us to use an external file that contains suggest entries. We may use it in the future.


<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">infixSuggester</str>
    <str name="lookupImpl">BlendedInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="blenderType">position_linear</str>
    <str name="field">suggester</str>
    <str name="contextField">suggesterContextField</str>
    <str name="minPrefixChars">4</str>
    <str name="suggestAnalyzerFieldType">textSuggest</str>
    <str name="indexPath">infix_suggestions</str>
    <str name="highlight">true</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">infixSuggester</str>
    <str name="suggest.onlyMorePopular">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Resources
Solr Suggester

Using YAML Configuration Files in Spring Boot

The Scenario
Sometimes developers like to use YAML property files in Spring Boot.

This tutorial shows how to load the properties for the correct profile and make them available in the Environment, so we can use appContext.getEnvironment().getProperty to get property values in static or non-Spring-managed contexts.

The implementation
EnvironmentAwarePropertySourcesPlaceholderConfigurer
First we create EnvironmentAwarePropertySourcesPlaceholderConfigurer: we can use its addYamlPropertySource to add a YAML file, which will load the properties defined for the active profile as well as the default properties.

/**
 * From http://jdpgrailsdev.github.io/blog/2014/12/30/groovy_script_spring_boot_yaml.html. This will
 * add propertySources into environment, so we can use environment().getProperty to get property
 * value in some cases.
 */
public class EnvironmentAwarePropertySourcesPlaceholderConfigurer extends PropertySourcesPlaceholderConfigurer
        implements EnvironmentAware, InitializingBean {

    private List<PropertySource<?>> propertySources = new ArrayList<>();
    private ConfigurableEnvironment environment;

    public EnvironmentAwarePropertySourcesPlaceholderConfigurer() {}

    /**
     * @param propertySources: order: abc-default.yaml, abc-{env}.yaml <br>
     *        This makes its usage same as org.springframework.context.annotation.PropertySource
     */
    public EnvironmentAwarePropertySourcesPlaceholderConfigurer(@Nonnull final List<PropertySource<?>> propertySources) {
        this.propertySources = propertySources;
    }

    @Override
    public void setEnvironment(final Environment environment) {
        // all subclasses extend ConfigurableEnvironment
        this.environment = (ConfigurableEnvironment) environment;
        super.setEnvironment(environment);
    }

    public EnvironmentAwarePropertySourcesPlaceholderConfigurer addYamlPropertySource(@Nonnull Resource resource)
            throws IOException {
        return addYamlPropertySource(resource.getFilename(), resource);
    }

    public EnvironmentAwarePropertySourcesPlaceholderConfigurer addYamlPropertySource(@Nonnull String name,
            @Nonnull Resource resource) throws IOException {
        YamlPropertySourceLoader loader = new YamlPropertySourceLoader();
        // order: abc-default.yaml, abc-{env}.yaml
        PropertySource<?> defaultYamlPropertySource = loader.load(name + ".default", resource, null);
        propertySources.add(defaultYamlPropertySource);
        PropertySource<?> applicationYamlPropertySource =
                loader.load(name + "." + System.getProperty("env"), resource, System.getProperty("env"));
        propertySources.add(applicationYamlPropertySource);
        return this;
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        // This adds them as abc-{env}.properties, abc-default.properties into
        // environment.propertySources.
        // Spring gets the value from the first propertySource which defines the property.
        // Check org.springframework.core.env.PropertySourcesPropertyResolver.getProperty(String,
        // Class<T>, boolean)
        if (propertySources != null) {
            propertySources.forEach(propertySource -> environment.getPropertySources().addFirst(propertySource));
        }
    }
}
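The addFirst ordering in afterPropertiesSet is what lets the env-specific file override the defaults: Spring returns the value from the first property source that defines the key. A stdlib-only model of that lookup (property names and values are made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PropertySourceOrderDemo {
    // Spring resolves a property from the FIRST source that defines it.
    static String lookup(List<Map<String, String>> sources, String key) {
        for (Map<String, String> source : sources) {
            if (source.containsKey(key)) {
                return source.get(key);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Map<String, String>> sources = new ArrayList<>();
        // afterPropertiesSet adds abc-default first, then abc-{env}, each with
        // addFirst - so the env-specific source ends up in front.
        sources.add(0, Map.of("redis.hostname", "localhost", "redis.port", "6379")); // defaults
        sources.add(0, Map.of("redis.hostname", "redis-qa"));                        // env-specific
        System.out.println(lookup(sources, "redis.hostname")); // redis-qa (env file wins)
        System.out.println(lookup(sources, "redis.port"));     // 6379 (falls back to defaults)
    }
}
```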
PropertySourcesPlaceholderConfigurer
Then we create a static PropertySourcesPlaceholderConfigurer bean and load the YAML files. Due to a Spring restriction, we can't use YAML files in @PropertySource.
@Bean
public static PropertySourcesPlaceholderConfigurer propertyConfig() throws IOException {
    final String password = System.getenv(ENV_APP_ENCRYPTION_PASSWORD);
    if (StringUtils.isBlank(password)) {
        return new EnvironmentAwarePropertySourcesPlaceholderConfigurer()
                .addYamlPropertySource(new ClassPathResource("cassandra.yaml")); // add more
    }
    return new EncryptedPropertySourcesPlaceholderConfigurer(password)
            .addYamlPropertySource(new ClassPathResource("cassandra.yaml"));
}
Check Spring - Encrypt Properties by Customizing PropertySourcesPlaceholderConfigurer to learn how to implement EncryptedPropertySourcesPlaceholderConfigurer.
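The first-wins resolution described above (addFirst puts abc-{env} ahead of abc-default) can be illustrated with a plain-Java sketch, independent of Spring. All names here are illustrative, not part of the Spring API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirstWinsResolver {
    // Mimics PropertySourcesPropertyResolver: return the value from the
    // first "property source" (map) that defines the key.
    static String resolve(List<Map<String, String>> sources, String key) {
        for (Map<String, String> source : sources) {
            if (source.containsKey(key)) {
                return source.get(key);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> envSource = new LinkedHashMap<>();      // abc-{env}
        envSource.put("cassandra.hosts", "prod-host");
        Map<String, String> defaultSource = new LinkedHashMap<>();  // abc-default
        defaultSource.put("cassandra.hosts", "localhost");
        defaultSource.put("cassandra.port", "9042");

        // addFirst ordering: the env-specific source is consulted before the default
        List<Map<String, String>> sources = List.of(envSource, defaultSource);
        System.out.println(resolve(sources, "cassandra.hosts")); // prod-host
        System.out.println(resolve(sources, "cassandra.port"));  // 9042
    }
}
```

Because the env-specific source is added first, its value shadows the default; keys it doesn't define fall through to abc-default.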

Resources
Leverage Spring Boot’s YAML Configuration Files in Groovy Scripts

Build Web Service APIs to Update Solr's Managed Resources (stop words, synonyms)

Use Case
Solr provides a REST API to update managed resources such as stop words and synonyms. But usually we wrap it and expose a REST API in our own application layer, so admins can perform Solr admin operations from the UI.

Also, we usually use CloudSolrClient and prefer the SolrJ API over making REST calls directly to the Solr servers - our application typically only knows the addresses of the ZooKeeper servers, not those of the Solr servers.

The Implementation
We first create generic APIs: getManagedResource, addManagedResource and deleteManagedResource. Then we call them to manage stop words and synonyms.

We use spring-data-solr's SolrJsonRequest in getManagedResource and addManagedResource, which helps parse the JSON response.

deleteManagedResource is more complex - we can't use SolrJ directly,
as org.apache.solr.client.solrj.SolrRequest.METHOD only supports GET, POST and PUT, not DELETE.

Here I use CloudSolrClient's Apache HttpClient to send an HttpDelete, and use ZkStateReader and ClusterState to get the address of one live Solr node.

The REST APIs just call these methods.

public String getManagedResource(final String path) {
    try {
        final SolrJsonRequest request = new SolrJsonRequest(METHOD.GET, path);
        return request.process(this.getSolrClient()).getJsonResponse();
    } catch (SolrServerException | IOException e) {
        throw new MyServerException(e, "Failed to get " + path);
    }
}
public void addManagedResource(final String path, final Object content, final boolean reloadCollection) {
    final SolrJsonRequest request = new SolrJsonRequest(METHOD.PUT, path);
    request.addContentToStream(content);
    try {
        final SolrJsonResponse response = request.process(this.getSolrClient());
        final int status = response.getStatus();
        logger.info(MessageFormat.format("add resource: {0}, status: {1}, result: {2}", path, status,
                response.getJsonResponse()));
        if (status != 0) {
            throw new MyServerException(ErrorCode.data_access_error,
                    MessageFormat.format("Failed to add resource, path: {0}, status: {1}", path, status));
        }
        if (reloadCollection) {
            reloadCollection();
        }
    } catch (SolrServerException | IOException | InterruptedException e) {
        throw new MyServerException(e, "Failed to add resource: " + path);
    }
}
public void deleteManagedResource(@Nonnull final List<String> paths, final boolean reloadCollection) {
    try {
        Preconditions.checkNotNull(paths);
        final String solrUrl = getOneSolrServerUrl(getSolrClient());
        final List<String> done = new ArrayList<>(paths.size());
        for (final String path : paths) {
            final HttpDelete request = new HttpDelete(solrUrl + path);

            final HttpResponse response = getSolrClient().getHttpClient().execute(request);
            final String entity = EntityUtils.toString(response.getEntity());
            logger.info(MessageFormat.format("delete path: {0}, result: {1}", path, entity));
            final ObjectMapper objectMapper = Util.createFailSafeObjectmapper();
            final Map<String, Object> resultMap = objectMapper.readValue(entity,
                    objectMapper.getTypeFactory().constructMapLikeType(Map.class, String.class, Object.class));
            final Map<String, Object> responseHeader = (Map<String, Object>) resultMap.get("responseHeader");
            if (responseHeader != null) {
                final int status = Integer.valueOf(responseHeader.get("status").toString());
                // ignore 404 which means it's already deleted
                if (status != 0 && status != 404) {
                    throw new MyServerException(MessageFormat.format(
                            "Failed to delete path: {0}, status: {1}, already deleted: {2}", path, status, done));
                }
            }
            done.add(path);
        }
        if (reloadCollection) {
            this.reloadCollection();
        }
    } catch (IOException | SolrServerException | InterruptedException e) {
        throw new MyServerException(ErrorCode.data_access_error, e, "Failed to delete path: " + paths);
    }
}

public String getSynonyms(final String language) {
    return getManagedResource("/schema/analysis/synonyms/" + language);
}
public void addSynonyms(final String language, final List<Object> synonyms, final boolean reloadCollection) {
    for (final Object synonym : synonyms) {
        addManagedResource("/schema/analysis/synonyms/" + language, synonym, reloadCollection);
    }
}
public void deleteSynonyms(final String language, final List<String> synonyms, final boolean reloadCollection) {
    if (CollectionUtils.isNotEmpty(synonyms)) {
        deleteManagedResource(synonyms.stream().map(synonym -> {
            return MessageFormat.format("/schema/analysis/synonyms/{0}/{1}", language,
                    Util.encodeAsUtf8(synonym));
        }).collect(Collectors.toList()), reloadCollection);
    }
}

public void addStopWords(final String language, final List<String> stopWords, final boolean reloadCollection) {
    addManagedResource("/schema/analysis/stopwords/" + language, stopWords, reloadCollection);
}
public String getStopWords(final String language) {
    return getManagedResource("/schema/analysis/stopwords/" + language);
}
public void deleteStopWords(final String language, final List<String> stopWords, final boolean reloadCollection) {
    if (CollectionUtils.isNotEmpty(stopWords)) {
        deleteManagedResource(stopWords.stream().map(stopWord -> {
            return MessageFormat.format("/schema/analysis/stopwords/{0}/{1}", language,
                    Util.encodeAsUtf8(stopWord));
        }).collect(Collectors.toList()), reloadCollection);
    }
}
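The delete paths above embed the synonym or stop word as a URL path segment, so it must be percent-encoded as UTF-8. The post's Util.encodeAsUtf8 is not shown; a minimal stand-in (my assumption of its behavior, using only the JDK) could look like:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PathEncoder {
    // Hypothetical stand-in for Util.encodeAsUtf8: percent-encode a path
    // segment as UTF-8. URLEncoder targets form encoding, so the "+" it
    // produces for spaces is rewritten to "%20" for use in a URL path.
    public static String encodeAsUtf8(final String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeAsUtf8("hello world")); // hello%20world
        System.out.println(encodeAsUtf8("café"));        // caf%C3%A9
    }
}
```

Without this encoding, a synonym containing a space or non-ASCII character would produce an invalid HttpDelete URL.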


public static String getOneSolrServerUrl(final CloudSolrClient solrClient) {
    final ZkStateReader zkReader = solrClient.getZkStateReader();
    final ClusterState clusterState = zkReader.getClusterState();
    final Set<String> liveNodes = clusterState.getLiveNodes();

    if (liveNodes.isEmpty()) {
        throw new MyServerException(ErrorCode.data_access_error, "No live Solr nodes");
    }
    // pick the first live node deterministically (TreeSet sorts the node names)
    return zkReader.getBaseUrlForNodeName(new TreeSet<>(liveNodes).iterator().next()) + "/"
            + solrClient.getDefaultCollection();
}
