Choose The Right Data Structure at The Right Time and Right Place
These days we have a basket full of various data structures, that most of which have similar functions. So which one is correct for my current problem?
Join the DZone community and get the full member experience.
Join For FreeSometimes it may happen for all of us as programmers, which data structure is suitable for this part of code? These days we have a basket full of various data structures, that most of which have similar functions. So which one is correct for my current problem? Ok now, I need to list the data structure in this part of my code which one is better? LinkedList or ArrayList, for answering these questions we must first have a clear understanding of the problem.
There is a very sacred book on computer science named Introduction to Algorithms, or briefly called CLRS, I suggest that everyone read this book at least once.
This book has a chapter named Complexity of Algorithms.
Types of Complexity
Okay, let's get to the point. As you can see, one of the most important factors in choosing a data structure is the complexity of the algorithms used in the program.
It means that before you choose the data structure from multiple choices you should detect first,
what is the dominant function in your application? And then you can choose the right data structure according to this function. For example, if most of your operations are 'insert' you should choose one data structure that has better complexity in the 'insert' function or if the most of operations are 'select' you should select another one that has good complexity in the 'select' operation.
So, what is complexity? The complexity of an algorithm is a measure of the amount of time and/or space required by an algorithm for an input of a given size (n).
There are 3 types of complexity:
1. Worst Case
... is the maximum run time, overall inputs of size n, ignoring effects (a) through (d) above. That is, we only consider the "number of times the principal activity of that algorithm is performed."
2. Best Case
In this case, we look at specific instances of input of size n. For example, we might get the best behavior from a sorting algorithm if the input to it is already sorted.
3. Average Case
Average-case complexity of an algorithm is the amount of some computational resource (typically time) used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case complexity which considers the maximal complexity of the algorithm over all possible inputs.
We mostly consider a worst-case in our selection, which means that most of the time we select data structures with better worst case, the worst-case presented with big-O, for example, O(1), O(N), etc.
Algorithms with O(1) are the best. Here is some regular complexity:
O(1) < O(log n) < O(n) < O(n log n)<O(n^2)<...
Example
Okay, let's look at an example for a better understanding. Every literature says that:
LinkedList
has the complexity of O(N) inget
value by index.ArrayList
has the complexity of O(1) inget
value by index.
Theoretically, it means ArrayList has better performance in get
value by index than LinkedList.
Let's look at this intuitively together in the code.
First, we have a class that checks ArrayList
complexity, the first array filled by 1000000 entries, then all of these items have get
s with index :
public class ArrayListComplexity {
public static void main(String args[]){
List<Integer> array=new ArrayList<>();
for(int i=0;i<1000000;i++)
array.add(i);
Long startTime=Calendar.getInstance().getTime().getTime();
for(int i=0;i<1000000;i++)
array.get(i);
System.out.println("time spent for GET in ArrayList :"+
(Calendar.getInstance().getTime().getTime() -startTime));
}
}
As you can see, wasted time is :13
.
time spent for GET in ArrayList :13
Now repeat the same experiment with the same assumption with LinkedList
:
public class LinkedListComplexity {
public static void main(String args[]){
List<Integer> linked=new LinkedList<>();
for(int i=0;i<1000000;i++)
linked.add(i);
Long startTime=Calendar.getInstance().getTime().getTime();
for(int i=0;i<1000000;i++)
linked.get(i);
System.out.println("time spent for GET in LinkedList :"+
(Calendar.getInstance().getTime().getTime() -startTime));
}
As you can see, the duration is 618443
.
time spent for GET in LinkedList :618443
The Complexity of Some Important Data Structures
In this part, I tried to gather some famous and versatile data structures with the complexity of them:
- ArrayList
Add | Get | Remove |
O(n) | O(1) | O(n) |
- LinkedList
Add | Get | Remove |
O(1) | O(n) | O(1) |
- HashSet
Add | Remove | Contains | Next |
O(1) | O(1) | O(1) | O(h/n) |
- LinkedHashSet
Add | Remove | Contains | Next |
O(1) | O(1) | O(1) | O(1) |
- EnumSet
Add | Remove | Contains | Next |
O(1) | O(1) | O(1) | O(1) |
- TreeSet
Add | Remove | Contains | Next |
O(log n) | O(log n) | O(log n) | O(log n) |
- HashMap
Get | ContainsKey | Next |
O(1) | O(1) | O(h /n ) |
- LinkedHashMap
Get | ContainsKey | Next |
O(1) | O(1) | O(1) |
- IdentityHashMap
Get | ContainsKey | Next |
O(1) | O(1) | O(h/n) |
- TreeMap
Get | ContainsKey | Next |
O(log n) | O(log n) | O(log n) |
- ConcurrentHashMap
Get | ContainsKey | Next |
O(1) | O(1) | O(h/n) |
Note: All of the above complexity tables are Time Complexity. but there is another type of complexity: Space complexity, which relies on space used by the data structure. Personally, I believe that if you have a data structure with good time complexity in your data structures, you can manage your space with good algorithms, paging, as well as good design.
Opinions expressed by DZone contributors are their own.
Comments