Friday, 15 August 2014

My New Home on GitHub

Dear visitors,

I am glad to announce that I have migrated my blog to GitHub, and I will be working over there from now on.

Just like everyone has to move on to exciting new things at some stage in life, I'll be spending less time maintaining this blog. It is sad, but shouldn't we all be looking forward to new challenges? And rest assured, I will not be closing this blog down.

Until then, keep well :D

Meng

Wednesday, 23 July 2014

Optimised Way of Modifying a String in Java

It probably rarely occurs to people that there could be a performance hit when manipulating strings in Java, since computing power nowadays is considered practically infinite, and nobody will really try to stuff a dozen Dead Sea Scrolls into one string and then tweak it.

Why should I bother!?

Yes: when it comes to critical systems, or simply for optimisation's sake, devs should be more conscious about which object to use when altering a string.

For starters, String is the most straightforward object to use in Java. The good thing about the String object is that it is immutable, which means it will NOT change after it is instantiated, and any modification will cause a new copy of the object to be created. There are a lot of benefits to this design. But when it comes down to mass modifications, it is more of a pain in the jacksie.
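
To see what that means in practice, here is a tiny sketch of my own (the class name ImmutableDemo is purely for illustration): every 'modification' hands back a brand new object while the original stays untouched.

public class ImmutableDemo {
    public static void main(String[] args) {
        String s = "scroll";
        String t = s.concat("s");   // concat returns a NEW String
        System.out.println(s);      // scroll  -> the original is untouched
        System.out.println(t);      // scrolls -> the "modification" lives in a fresh copy
        System.out.println(s == t); // false   -> two distinct objects
    }
}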

What happens now, panic?

Of course not; the clever people before us have not only come across the problem but also resolved it. And that's why I'm going to showcase the solutions, so everyone can benefit from their great work.

The objects to replace plain String are StringBuilder and StringBuffer.

Since both objects implement the same interfaces, the methods available are almost identical. Of course I didn't compare the methods line by line, but you can take my word for it that there are only ever subtle differences between them.

So what's the difference?

In one sentence, StringBuffer is synchronised while StringBuilder is not.

That makes all the difference: StringBuffer is considered safer to use in multi-threaded environments, while it creates unnecessary overhead in single-threaded scenarios. But bear in mind, the masterminds at Sun (previously) perceived the world differently: StringBuffer actually came before StringBuilder. True story.

Yack yack yack, prove it...

public class Main {
    public static void main(String[] args) {
        int N = 100000;
        long t;

        {
            // Naive String concatenation: every += allocates a brand new String
            String s = "";
            t = System.currentTimeMillis();
            for (int i = N; i --> 0 ;) {
                s += i;
            }
            System.out.println(System.currentTimeMillis() - t);
        }

        {
            // StringBuffer: mutable and synchronised
            StringBuffer sb = new StringBuffer();
            t = System.currentTimeMillis();
            for (int i = N; i --> 0 ;) {
                sb.append(i);
            }
            System.out.println(System.currentTimeMillis() - t);
        }

        {
            // StringBuilder: mutable, unsynchronised and the fastest of the three
            StringBuilder sb = new StringBuilder();
            t = System.currentTimeMillis();
            for (int i = N; i --> 0 ;) {
                sb.append(i);
            }
            System.out.println(System.currentTimeMillis() - t);
        }
    }
}

Conclusion

It is obvious, right? Use the StringBuilder (Force), Luke!

Wednesday, 2 July 2014

A Basic Twitter Message Queue Service using ActiveMQ and WebSocket

I've been planning to write this for a long time, since I got so fascinated by the simplicity and elegance of message queues, and the performance boost they can bring to various systems. So here we are.

TL;DR

If you just want to see an example, or make sure you get the right dependencies: boom, here you go!

What is Message Queue?

A message queue is the storage area of a messaging mechanism that allows distributed applications to communicate asynchronously by sending messages to one another.

Why Message Queue?

I think you can probably google a dozen reasons why you should do it. Speaking from my experience working on various integration projects, message queuing is the pursuit of high performance, high scalability, high resilience and low coupling, while accomplishing asynchronous communication, buffering and filtering at the same time.

What Choices Do I Have?

Rather than trying to implement your own message queueing, here are some of the most notable MQ implementations: ActiveMQ, RabbitMQ and ZeroMQ.

Blah Blah Blah, Give Me an Example...

The example I put together is a middle layer between the Twitter API and our applications, using ActiveMQ and WebSocket to implement the messaging.

Its purpose is to make sure we are able to deliver tweets across different applications in a consistent and timely fashion. To be precise, it is essentially to ensure we have a mechanism to filter out some potty mouths, and to control and monitor traffic to our applications.

1. Install ActiveMQ

Install it with Homebrew, then run the command 'activemq start'. Job done.

To verify the service is actually running, type in command 'netstat -an | grep 61616', or browse to 'http://localhost:8161'

For basic usage, we are only interested in 'Queues', where the number of consumers and the number of messages enqueued and dequeued are displayed in the dashboard.

2. Java 8

Java on the Mac is a pain in the neck. To make sure you can use Java 8, go to Oracle, download the package and install it.

It is slightly trickier to configure the Mac. In the terminal, type in

'cd /System/Library/Frameworks/JavaVM.framework/Versions/' 

which should bring you to a list of the JDKs available on your Mac.

Remove 'CurrentJDK' and symlink it to the version you want, using the commands:

'rm CurrentJDK'

'ln -s /Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/ CurrentJDK'

3. Producer and Consumer

The whole message queueing setup fits the producer-consumer pattern perfectly. On one hand, the producer stocks new tweets into our container, ActiveMQ. On the other, the consumer fetches the messages from the container and processes them.

The producer in the simplest form:

import javax.jms.*;

import org.apache.activemq.ActiveMQConnection;
import org.apache.activemq.ActiveMQConnectionFactory;

public class Producer {
    private static String url = ActiveMQConnection.DEFAULT_BROKER_URL; // Configure to use localhost
    private static String subject = "TWEETQUEUE"; // This is the name of the queue you will see in ActiveMQ dashboard

    public static void main(String[] args) throws JMSException {
        ConnectionFactory connectionFactory =
            new ActiveMQConnectionFactory(url);
        Connection connection = connectionFactory.createConnection();
        connection.start();

        Session session = connection.createSession(false,
            Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue(subject);
        MessageProducer producer = session.createProducer(destination);
        TextMessage message = session.createTextMessage("A new message");
        producer.send(message);

        connection.close();
    }
}

The consumer in its simplest form:

import javax.jms.*;

import org.apache.activemq.ActiveMQConnection;
import org.apache.activemq.ActiveMQConnectionFactory;

public class Consumer {
    private static String url = ActiveMQConnection.DEFAULT_BROKER_URL;
    private static String subject = "TWEETQUEUE";

    public static void main(String[] args) throws JMSException {
        ConnectionFactory connectionFactory
            = new ActiveMQConnectionFactory(url);
        Connection connection = connectionFactory.createConnection();
        connection.start();

        Session session = connection.createSession(false,
            Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue(subject);
        MessageConsumer consumer = session.createConsumer(destination);
        Message message = consumer.receive();

        if (message instanceof TextMessage) {
            TextMessage textMessage = (TextMessage) message;
            System.out.println("Message: "
                + textMessage.getText());
        }

        connection.close();
    }
}
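
A small aside: receive() above blocks until a message turns up. If you would rather have the broker push messages to you, here is a listener-based sketch of my own (the class name AsyncConsumer and the one-minute sleep are just illustrative, not part of the example project):

import javax.jms.*;

import org.apache.activemq.ActiveMQConnection;
import org.apache.activemq.ActiveMQConnectionFactory;

public class AsyncConsumer {
    private static String url = ActiveMQConnection.DEFAULT_BROKER_URL;
    private static String subject = "TWEETQUEUE";

    public static void main(String[] args) throws JMSException, InterruptedException {
        ConnectionFactory connectionFactory = new ActiveMQConnectionFactory(url);
        Connection connection = connectionFactory.createConnection();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue(subject);
        MessageConsumer consumer = session.createConsumer(destination);

        // The broker pushes messages to us; no blocking receive() calls needed
        consumer.setMessageListener(message -> {
            try {
                if (message instanceof TextMessage) {
                    System.out.println("Message: " + ((TextMessage) message).getText());
                }
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });

        connection.start(); // start delivery once the listener is registered

        Thread.sleep(60000); // keep the JVM alive for a minute so the listener can do its work
        connection.close();
    }
}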

4. Twitter Integration

To make devs' lives easier, Twitter provides a nice HTTP client called hbc. The Twitter API is authorisation-based, which means you either provide your account and password in the configuration, or create a Twitter app and use the tokens provided. I'd strongly recommend the latter, and not just because I can't get the account and password working, right? Surely I wouldn't tell you even if that were true :) Google 'Why OAuth' if you are wondering why.

The exact example I referenced can be found in hbc; I have blended in the Producer here, so that we are actually injecting tweets into ActiveMQ.

import com.google.common.collect.Lists;
import com.twitter.hbc.ClientBuilder;
import com.twitter.hbc.core.Constants;
import com.twitter.hbc.core.endpoint.StatusesFilterEndpoint;
import com.twitter.hbc.core.processor.StringDelimitedProcessor;
import com.twitter.hbc.httpclient.BasicClient;
import com.twitter.hbc.httpclient.auth.Authentication;
import com.twitter.hbc.httpclient.auth.OAuth1;

import org.apache.activemq.ActiveMQConnection;
import org.apache.activemq.ActiveMQConnectionFactory;

import javax.jms.*;

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class Tweet {
    private static final String consumerKey = "";
    private static final String consumerSecret = "";
    private static final String token = "";
    private static final String secret = "";
    private static String url = ActiveMQConnection.DEFAULT_BROKER_URL;
    private static String subject = "TWEETQUEUE";

    public static void main(String[] args) throws InterruptedException, JMSException {
        ConnectionFactory connectionFactory = new ActiveMQConnectionFactory(url);
        Connection connection = connectionFactory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue(subject);
        MessageProducer producer = session.createProducer(destination);

        BlockingQueue<String> queue = new LinkedBlockingQueue<String>(10000);
        StatusesFilterEndpoint endpoint = new StatusesFilterEndpoint();

        endpoint.followings(Lists.newArrayList(871686942L)); // @BBCOne

        Authentication auth = new OAuth1(consumerKey, consumerSecret, token, secret);

        BasicClient client = new ClientBuilder()
            .name("sampleExampleClient")
            .hosts(Constants.STREAM_HOST)
            .endpoint(endpoint)
            .authentication(auth)
            .processor(new StringDelimitedProcessor(queue))
            .build();

        client.connect();

        for (int msgRead = 0; msgRead < 1000; msgRead++) {
            if (client.isDone()) {
                System.out.println("Client connection closed unexpectedly: " + client.getExitEvent().getMessage());
                break;
            }

            String msg = queue.poll(5, TimeUnit.SECONDS);
            if (msg == null) {
                System.out.println("Did not receive a message in 5 seconds");
            } else {
                System.out.println(msg);
                TextMessage message = session.createTextMessage(msg);
                producer.send(message);
            }
        }

        client.stop();
    }
}

5. Send via WebSocket

Once the queue starts filling with junk, errr, tweets, we need to find a way to let the messages out. There are quite a number of broadcasting mechanisms we could use, but bearing in mind speed, the cost of message wrapping and the ability to keep the network connection open, WebSocket seems a very trendy and fashionable choice :)

WARNING: if you DO need to deal with ancient browsers (pre-2011, as if, right?), make sure WebSocket is supported!

import java.io.BufferedReader;
import java.io.InputStreamReader;

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Destination;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.TextMessage;
import javax.websocket.CloseReason;
import javax.websocket.OnClose;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

import org.apache.activemq.ActiveMQConnection;
import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.log4j.Logger;
import org.glassfish.tyrus.server.Server;
import org.json.JSONObject;

@ServerEndpoint(value = "/tweets")
public class TweetFeedServer {
    private static String url = ActiveMQConnection.DEFAULT_BROKER_URL;
    private static String subject = "TWEETQUEUE";
    private Logger logger = Logger.getLogger(this.getClass().getName());
    
    @OnOpen
    public void onOpen(Session session) {
        logger.info("Connected ... " + session.getId());
    }

    @OnMessage
    public void onMessage(String message, Session session) throws Exception {
        ConnectionFactory connectionFactory = new ActiveMQConnectionFactory(url);
        Connection connection = connectionFactory.createConnection();
        connection.start();

        javax.jms.Session jmsSession = connection.createSession(false, javax.jms.Session.AUTO_ACKNOWLEDGE);
        Destination destination = jmsSession.createQueue(subject);
        MessageConsumer consumer = jmsSession.createConsumer(destination);

        // Forward up to 1000 queued tweets to the connected WebSocket client
        for (int msgRead = 0; msgRead < 1000; msgRead++) {
            Message jmsMessage = consumer.receive();
            if (jmsMessage instanceof TextMessage){
                TextMessage textMessage = (TextMessage) jmsMessage;

                try {
                    JSONObject receivedMessage = new JSONObject(textMessage.getText());
                    JSONObject processedMessage = new JSONObject();

                    System.out.println("Processed message: " + receivedMessage);

                    processedMessage.put("name", receivedMessage.getJSONObject("user").getString("name"));
                    processedMessage.put("icon", receivedMessage.getJSONObject("user").getString("profile_image_url"));
                    processedMessage.put("message", receivedMessage.getString("text"));

                    System.out.println("Processed message: " + processedMessage);
                    session.getBasicRemote().sendText(processedMessage.toString()); // sendText avoids needing a registered encoder for JSONObject
                } catch(Exception e) {
                    System.out.println(e.getMessage());
                }
            }

        }

        connection.close();
    }

    @OnClose
    public void onClose(Session session, CloseReason closeReason) {
        logger.info(String.format("Session %s closed because of %s", session.getId(), closeReason));
    }
    
    // For testing only
    public static void main(String[] args) {
        Server server = new Server("localhost", 8025, "/websockets", TweetFeedServer.class);

        try {
            server.start();
            BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
            System.out.print("Please press a key to stop the server.");
            reader.readLine();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            server.stop();
        }
    }
}
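
And to watch the feed without a browser, here is a rough Java client sketch of my own (the class name TweetFeedClient is made up, and it assumes a Tyrus client implementation on the classpath) that connects to the endpoint, sends a dummy message to kick off the server's loop, and prints whatever comes back.

import java.io.IOException;
import java.net.URI;
import java.util.concurrent.CountDownLatch;

import javax.websocket.ClientEndpoint;
import javax.websocket.ContainerProvider;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.WebSocketContainer;

@ClientEndpoint
public class TweetFeedClient {
    private static final CountDownLatch latch = new CountDownLatch(1);

    @OnOpen
    public void onOpen(Session session) {
        System.out.println("Connected ... " + session.getId());
        try {
            session.getBasicRemote().sendText("start"); // any message kicks off the server's onMessage loop
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @OnMessage
    public void onMessage(String message, Session session) {
        System.out.println("Tweet received: " + message);
    }

    public static void main(String[] args) throws Exception {
        WebSocketContainer container = ContainerProvider.getWebSocketContainer();
        container.connectToServer(TweetFeedClient.class, new URI("ws://localhost:8025/websockets/tweets"));
        latch.await(); // keep the client running; stop it with Ctrl+C
    }
}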

Conclusion

The setup of ActiveMQ and the Twitter integration is pretty easily done, yet finding the right way to produce and consume messages took a fair bit of trial and error, and deciding what to use for the broadcasting part, and how, was quite fun.

The true potential of message queues is surely more than this: live asynchronous updates and fast, lightweight queueing provide limitless room for large-scale systems integration.

I hope this blog does give you some ideas to think about.

PS: the all-in-one solution can be found on my GitHub.

A Quick Hack Into Nancy

Nancy, a very neat way to build HTTP-based services on .NET and Mono, has been around for quite a while (I suppose?). Since I work in a very interesting environment, I had not actually got a chance to take a look at it until now.

As far as I understand, Nancy is the next-generation web application framework in .NET, after Web Forms, MVC and Web API. Even better, Nancy provides a slick way to develop RESTful web services. Since Nancy's documentation is pretty comprehensive, I will just crack on with the hack.

As usual, Visual Studio is the place we go. Since I am using Visual Studio 2013, there is a slight change to the way web application frameworks are presented: there are only two selections left.

The Web Forms, MVC and Web API frameworks are now in a sub-category under Web Application (perhaps Microsoft has got some evil plans?). But we just need an empty web project to kick off.

Here is all we have and need: minimal DLLs and a web.config.

Then we basically need to choose how we want to host the application, and pick a corresponding package. This is another reason why I think Nancy is really cool: it currently offers four different hosting options, ASP.NET, self host, WCF and OWIN. To start with, I will just demo ASP.NET. Use the NuGet commands to get the two essential packages.

Install-Package Nancy
Install-Package Nancy.Hosting.Aspnet

And then you will find some smurfs have already helped you populate the web.config.

Bear in mind, you will get into trouble if you don't have the hosting handlers declared in the config.

Following the setup, here comes the smoke and mirrors.


IMPORTANT:

There is a trap that can easily lure a self-declared smartass like me in. DO NOT NAME YOUR PROJECT 'NANCY'. Because after all the effort you've put in, what you would get is this message of discontent:

Could not load type 'Nancy.Bootstrapper.INancyBootstrapper' from assembly 'Nancy, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null


That is definitely not what you want after joyfully trying out some so-called easy-peasy framework, and it makes absolutely no sense whatsoever at all. The reason is simple: the runtime resolves the 'Nancy' assembly to the project you created rather than to the framework, and of course it will never find the type there!

Since some people (like me) just have to learn it the hard way, and googling is not much help in this case, I hope I can at least spare someone else the misery.

Friday, 28 March 2014

vi Cheat Sheet

Navigating File
w forward one word
b back to start of word
0 start of line
$ end of line
nG go to line n
G go to last line

Inserting Text ((§) = press ESC to finish)
i insert before cursor (§)
A append at the end of line (§)
o open new line below (§)
C change to end of line (§)
O open new line above (§)
D delete to end of line

Editing (1 char / 1 word / n words / 1 line / n lines)
delete: x / dw / ndw / dd / ndd
change: r / cw (§) / ncw (§) / cc (§) / ncc (§)
copy (yank) into buffer: yw / nyw / yy / nyy (1 word / n words / 1 line / n lines)
p paste buffer after/below cursor
P paste buffer before/above cursor
delete -> move cursor -> paste = cut & paste
copy -> move cursor -> paste = duplicate

Searching for Text
/pattern search for pattern
n repeat last search
:n,ms/old/new/ between lines n and m substitute old with new

Miscellaneous
. repeat last edit command
u undo last edit command
>> indent line
U undo all changes to current line
J join lines
^L redraw screen

Writing the file / Quitting vi
:r file read contents of file and place after current line
:n,mw file write portion of file between lines n and m to file
:w save
:w file save to file
:w! force save (often as root)
:q quit
:wq save & quit
:q! force quit (no save)

Escaping into shell
:!ksh run interactive shell; press ^D to return to vi
:!cmd run cmd; press <CR> to return to vi
!!cmd run cmd; put its output into file being edited

Wednesday, 5 February 2014

A Short Guide to AWK

AWK is built to process column-oriented text data, such as tables, in which a file is considered to consist of N records (rows) by M fields (columns).

Basic

# awk < file '{print $2}'

# awk '{print $2}' file

file = input file
print $2 = print the 2nd field of the line; awk uses whitespace as the default delimiter

Delimiter

# awk -F: '{print $2}' file

-F: = use ':' as the delimiter

# awk -F '[:;]' '{print $2}' file

-F '[:;]' = use multiple delimiters, splitting on EITHER ':' OR ';'

# awk -F ':;' '{print $2}' file

-F ':;' = use the two-character string ':;' as THE delimiter

# awk -F ':+' '{print $2}' file

-F ':+' = treat any run of consecutive ':' characters as a single delimiter

Arithmetic

# echo 5 4 | awk '{print $1 + $2}'

output is '9', the result of '+' (which works as addition)

# echo 5 4 | awk '{print $1 $2}'

output is '54', i.e. string concatenation of the two fields

# echo 5 4 | awk '{print $1, $2}'

output is '5 4', the values of the 1st and 2nd fields separated by a space

Variables

# awk -F '<FS>' '{print $2}' file

<FS>, aka the field separator variable, can consist of any single character or regular expression
e.g. awk -F ':' '{print $3}' /etc/passwd

# awk -F '<FS>' 'BEGIN{OFS="<OFS>"} {print $3, $4}' file

<OFS>, aka the output field separator variable, is the value inserted between output fields
e.g. awk -F ':' 'BEGIN{OFS="|||"} {print $3, $4}' /etc/passwd

# awk 'BEGIN {RS="<RS>"} {print $1}' file

<RS>, aka the record separator variable, works the same way as FS, but splits the input into records (rows) instead of fields

# awk 'BEGIN{ORS="<ORS>"} {print $3, $4}' file

<ORS>, aka the output record separator variable, works the same way as OFS, but for output records

# awk '{print NR}' file

NR, aka number of records, is equivalent to line number
it is quite helpful when calculating average
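
For example, a quick one-liner of my own (assuming /etc/passwd as the input, so the 3rd field is the numeric user id):

# awk -F ':' '{sum += $3} END {print sum/NR}' /etc/passwd

it keeps a running total of the 3rd field and divides it by the number of records once the whole file has been read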

# awk '{print NF}' file

NF, aka the number of fields, uses whitespace as the delimiter and returns the number of fields in the line
the value will change when the delimiter is redefined with '-F'

Note: awk '{print $NF}' file

this will print out the last field of the line instead of the number of fields; a similar usage is $0, which prints out the whole line

# awk '{print FILENAME}' file

it prints out the file name

# awk '{print FILENAME, FNR}' file1 file2

FNR, aka the number of records for each input file, gives the record number relative to the file currently being read

Monday, 3 February 2014

How to: Upgrade SVN to 1.7 on CentOS

After searching the web, I managed to find bits and pieces about upgrading Subversion from 1.6 to 1.7 on CentOS. Unfortunately, it is like unravelling a puzzle: every blog starts from a slightly different corner and runs into different blocks. I was stuck in a loop of trying and failing because of this. So I thought I had better summarise and share my whole experience, along with the pitfalls I hit while trying to work it out.


The upgrade, in a nutshell, is a process of uninstalling the current version of svn, then downloading, compiling and installing the target version.

Prerequisites

CentOS, SVN 1.6


Recipes and Tasting

1. Remove current svn

That's right! You heard me, be brave and remove svn using:
# yum remove subversion

2. Download target version of svn


Go to the following location:
# cd /usr/local/src/

And download the target version of svn using:
# wget http://apache.mirrors.timporter.net/subversion/subversion-1.7.16.tar.gz

You might well come across a problem where you can't download this specific version because it no longer exists. Fear not: just go to this mirror, find the version you want and wget it again.

3. Unzip svn

Unzip the svn using:
# tar zxf subversion-1.7.16.tar.gz 

And navigate into the directory:
# mv subversion-1.7.16 subversion
# cd subversion

4. Download dependencies for svn

There are two packages needed by svn, namely apr and apr-util, before it can be compiled. Download and unzip them into the directory we navigated to previously.

tar zxf apr-1.5.0.tar.gz
tar zxf apr-util-1.5.3.tar.gz

Of course, go to this mirror to find a specific version if the one listed here no longer exists.

5. Rename dependencies directory

mv apr-1.5.0 apr
mv apr-util-1.5.3 apr-util

6. Configure svn

By running:

./configure

6.1 Oops, you might be missing sqlite dependency

The steps have been straightforward so far, yet configure may fail for the following reason:

configure: checking sqlite library
checking sqlite amalgamation... no
checking sqlite3.h usability... no
checking sqlite3.h presence... no
checking for sqlite3.h... no
checking sqlite library version (via pkg-config)... no

An appropriate version of sqlite could not be found.  We recommmend
3.7.6.3, but require at least 3.6.18.
Please either install a newer sqlite on this system

or

get the sqlite 3.7.6.3 amalgamation from:
unpack the archive using tar/gunzip and copy sqlite3.c from the
resulting directory to:
/usr/local/src/subversion-1.7.14/sqlite-amalgamation/sqlite3.c

configure: error: Subversion requires SQLite

Now have a look at the sqlite version:

# sqlite3 -version

Mine comes up as '3.3.6', which is apparently well out of date. If you are sure you've got the correct version, try ./configure again; on success go to step 7, otherwise jump to 6.2.

Now the sqlite has to be downloaded and upgraded:

yum --enablerepo=atomic upgrade sqlite

Even though this step worked quite well for me, keep trying to find the correct package if the repo lets you down. Then check the sqlite version again: 3.7.0.1, it says. Hooray.

6.2 Yet another Oops, you might need to get sqlite dependency manually

Sometimes, even if everything seems to be correct, it just won't work. Just like in my case, where I had the correct version of sqlite, yet ./configure still decided not to play along! That left me with the second option: get the sqlite amalgamation files manually and put them in a specific directory, just for svn configuration's sake:

unzip sqlite-amalgamation-3080200.zip
mv sqlite-amalgamation-3080200 sqlite-amalgamation

7. Compile and install svn

By now, ./configure should run smoothly. Otherwise, I seriously have no idea what you are up against, so go cry for help elsewhere!

It is then a matter of compiling and installing it:

make
make install

Hmm, might as well go grab a cup of tea while waiting for it to compile. What's the worst that could happen, right? :D

8. Check the svn version and happily upgrade the project

cd <project directory>
svn upgrade

9. Certificate

If, unfortunately, the svn server uses the HTTP/HTTPS protocol, a specific module, 'serf', has to be added to svn, and possibly a P12 certificate has to be assigned (for HTTPS).

The 'serf' can be downloaded from: 

# wget http://serf.googlecode.com/files/serf-1.2.1.zip

Unzip and move it to the default folder, and reinstall the project:

# unzip serf-1.2.1.zip
# mv serf-1.2.1 serf

Alternatively, specify its location when reconfiguring the subversion project:

# ./configure --with-serf=/usr/local/src/subversion/serf
# make
# make install

Even more annoyingly, the certificate needs to be configured in:

# cd ~/.subversion/
# vim servers

with the following information attached to the end of the file:

ssl-client-cert-file = <P12 unc>
ssl-client-cert-password = <***>

And that assumes you have been given the details by your sysadmin. That is why I call it 'unfortunate', and I shall follow it up once I manage to crack that part of the world :D
