Python: Making scikit-learn and Pandas Play Nice
In the last post, about Nathan's and my attempts at the Kaggle Titanic problem, I mentioned that our next step was to try out scikit-learn, so I thought I should summarize where we've got to. We needed to write a classification algorithm to work out whether a person on board the Titanic survived and, luckily, scikit-learn has extensive documentation on each of its algorithms. Unfortunately almost all of those examples use numpy data structures, and we'd loaded the data using pandas and didn't particularly want to switch back! Luckily it was really easy to get the data into numpy format by calling 'values' on a pandas data structure, something we learnt from a reply on Stack Overflow.

For example, if we wanted to wire up an ExtraTreesClassifier which worked out survival rate based on the 'Fare' and 'Pclass' attributes, we could write the following code:

import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.cross_validation import cross_val_score

train_df = pd.read_csv("train.csv")

et = ExtraTreesClassifier(n_estimators=100, max_depth=None, min_samples_split=1, random_state=0)

columns = ["Fare", "Pclass"]

labels = train_df["Survived"].values
features = train_df[list(columns)].values

et_score = cross_val_score(et, features, labels, n_jobs=-1).mean()
print("{0} -> ET: {1}".format(columns, et_score))

To start with we read in the CSV file, which looks like this:

$ head -n5 train.csv
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S

Next we create our classifier, which "fits a number of randomized decision trees (a.k.a. extra-trees) on various sub-samples of the dataset and use averaging to improve the predictive accuracy and control over-fitting", i.e. a better version of a random forest.

On the next line we describe the features we want the classifier to use, then we convert the labels and features into numpy format so we can pass them to the classifier. Finally we call the cross_val_score function, which splits our training data set into training and test components, trains the classifier against the former and checks its accuracy using the latter.

If we run this code we'll get roughly the following output:

$ python et.py
['Fare', 'Pclass'] -> ET: 0.687991021324

This is actually worse accuracy than we'd get with the simple rule that every female survived and every male didn't.
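As a sanity check, we can compute that gender-only baseline directly from the training data frame. A minimal sketch (at this point the 'Sex' column still holds the raw 'male'/'female' strings):

# Predict survival for every female and death for every male,
# then measure how often that rule matches the actual labels
baseline = (train_df["Sex"] == "female").astype(int)
print((baseline == train_df["Survived"]).mean())  # comes out around 0.79 on this training set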
We can introduce 'Sex' into the classifier by adding it to the list of columns:

columns = ["Fare", "Pclass", "Sex"]

If we re-run the code we'll get the following error:

$ python et.py
An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (514, 0))
...
Traceback (most recent call last):
  File "et.py", line 14, in <module>
    et_score = cross_val_score(et, features, labels, n_jobs=-1).mean()
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/cross_validation.py", line 1152, in cross_val_score
    for train, test in cv)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/externals/joblib/parallel.py", line 519, in __call__
    self.retrieve()
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/externals/joblib/parallel.py", line 450, in retrieve
    raise exception_type(report)
sklearn.externals.joblib.my_exceptions.JoblibValueError
/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/externals/joblib/my_exceptions.py:26: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  self.message,
: JoblibValueError
___________________________________________________________________________
Multiprocessing exception:
...
ValueError: could not convert string to float: male
___________________________________________________________________________

This is a slightly verbose way of telling us that we can't pass non-numeric features to the classifier - in this case 'Sex' has the values 'female' and 'male'. We'll need to replace those values with numeric equivalents:

train_df["Sex"] = train_df["Sex"].apply(lambda sex: 0 if sex == "male" else 1)

Now if we re-run the classifier we'll get a slightly more accurate prediction:

$ python et.py
['Fare', 'Pclass', 'Sex'] -> ET: 0.813692480359

The next step is to use the classifier against the test data set, so let's load the data and run the prediction:

test_df = pd.read_csv("test.csv")

et.fit(features, labels)
et.predict(test_df[columns].values)

Now if we run that:

$ python et.py
['Fare', 'Pclass', 'Sex'] -> ET: 0.813692480359
Traceback (most recent call last):
  File "et.py", line 22, in <module>
    et.predict(test_df[columns].values)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/ensemble/forest.py", line 444, in predict
    proba = self.predict_proba(X)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/ensemble/forest.py", line 479, in predict_proba
    X = array2d(X, dtype=DTYPE)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/utils/validation.py", line 91, in array2d
    X_2d = np.asarray(np.atleast_2d(X), dtype=dtype, order=order)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/numeric.py", line 235, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: male

This is the same problem we had earlier! We need to replace the 'male' and 'female' values in the test set too, so we'll pull out a function to do that:

def replace_non_numeric(df):
    df["Sex"] = df["Sex"].apply(lambda sex: 0 if sex == "male" else 1)
    return df
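As an aside, the same substitution can also be written with pandas' Series.map, which some may find clearer than the lambda. A sketch of an equivalent version (like the lambda, it assumes 'Sex' only ever contains 'male' or 'female'; map would turn any other value into NaN rather than 1):

def replace_non_numeric(df):
    # map looks each value up in the dictionary and substitutes the result
    df["Sex"] = df["Sex"].map({"male": 0, "female": 1})
    return df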
Now we'll call that function with our training and test data frames:

train_df = replace_non_numeric(pd.read_csv("train.csv"))
test_df = replace_non_numeric(pd.read_csv("test.csv"))

If we run the program again:

$ python et.py
['Fare', 'Pclass', 'Sex'] -> ET: 0.813692480359
Traceback (most recent call last):
  File "et.py", line 26, in <module>
    et.predict(test_df[columns].values)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/ensemble/forest.py", line 444, in predict
    proba = self.predict_proba(X)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/ensemble/forest.py", line 479, in predict_proba
    X = array2d(X, dtype=DTYPE)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/utils/validation.py", line 93, in array2d
    _assert_all_finite(X_2d)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14.1-py2.7-macosx-10.8-intel.egg/sklearn/utils/validation.py", line 27, in _assert_all_finite
    raise ValueError("Array contains NaN or infinity.")
ValueError: Array contains NaN or infinity.

There are missing values in the test set, so we'll replace those with average values from our training set using an Imputer:

from sklearn.preprocessing import Imputer
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(features)

test_df = replace_non_numeric(pd.read_csv("test.csv"))

et.fit(features, labels)
print et.predict(imp.transform(test_df[columns].values))

If we run that it completes successfully:

$ python et.py
['Fare', 'Pclass', 'Sex'] -> ET: 0.813692480359
[0 1 0 0 1 0 0 1 1 0 0 0 1 0 1 1 0 0 1 1 0 0 1 0 1 0 1 0 1 0 0 0 1 0 1 0 0 0 0 1 0 0 0 1 1 0 0 0 1 1 0 0 1 1 0 0 0 0 0 1 0 0 0 1 0 1 1 0 0 1 1 0 1 0 1 0 0 1 0 1 1 0 0 0 0 0 1 0 1 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 1 1 1 1 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 1 0 1 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 1 1 1 1 1 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 1 0 0 0 1 0 1 0 0 1 0 1 1 0 1 0 0 0 1 1 0 1 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 1 0 1 0 0 1 0 1 0 0 1 0 0 1 1 1 1 0 0 1 0 0 0]

The final step is to add these values to our test data frame and then write that to a file so we can submit it to Kaggle. The predictions come back as a numpy.ndarray, which we can convert to a pandas Series quite easily:

predictions = et.predict(imp.transform(test_df[columns].values))
test_df["Survived"] = pd.Series(predictions)

We can then write the 'PassengerId' and 'Survived' columns to a file:

test_df.to_csv("foo.csv", cols=['PassengerId', 'Survived'], index=False)

The output file looks like this:

$ head -n5 foo.csv
PassengerId,Survived
892,0
893,1
894,0
895,0

The code we've written is on GitHub in case it's useful to anyone.
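As a postscript: since the documentation bills extra-trees as an improvement on random forests, it's easy to check that claim on this data set by scoring a RandomForestClassifier the same way. A minimal sketch that reuses the features, labels and columns variables from above (the hyper-parameters simply mirror the ones we passed to the ExtraTreesClassifier):

from sklearn.ensemble import RandomForestClassifier

# Same cross-validation scoring as before, with a plain random forest swapped in
rf = RandomForestClassifier(n_estimators=100, max_depth=None, min_samples_split=1, random_state=0)
rf_score = cross_val_score(rf, features, labels, n_jobs=-1).mean()
print("{0} -> RF: {1}".format(columns, rf_score))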
November 14, 2013
by Mark Needham