JDK_The Collections Framework

img

preface

问题描述

最近有个二货在线上使用Collections#sort()的时候报了个错, java.lang.IllegalArgumentException: Comparison method violates its general contract! 运行的JDK环境是1.7

image-20190729155205469

本着博爱的精神,看了看之前也遇到类似的问题.就决定总结下,其实这个是JDK1.6到1.7的兼容问题.在https://www.oracle.com/technetwork/java/javase/compatibility-417013.html#incompatibilities 中可以发现

Area: API: Utilities
Synopsis: Updated sort behavior for Arrays and Collections may throw an IllegalArgumentException
Description: The sorting algorithm used by java.util.Arrays.sort and (indirectly) by java.util.Collections.sort has been replaced. The new sort implementation may throw an IllegalArgumentException if it detects a Comparable that violates the Comparable contract. The previous implementation silently ignored such a situation.
If the previous behavior is desired, you can use the new system property, java.util.Arrays.useLegacyMergeSort, to restore previous mergesort behavior.
Nature of Incompatibility: behavioral
RFE: 6804124

​ 如果违法了比较的约束新的排序算法也许会抛出llegalArgumentException异常。JDK6中的实现则忽略了这种情况

根本原因

实现Comparable接口后,要满足一下三个特性:

  • 自反性:x,y 的比较结果和 y,x 的比较结果相反。 sgn(compare(x, y)) == -sgn(compare(y, x))
  • 传递性:x>y, y>z, 则 x>z。((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0
  • 对称性:x=y,则 x,z 比较结果和 y,z 比较结果相同。 compare(x, y) == 0 implies that sgn(compare(x, z)) == sgn(compare(y, z)) for all z

​ 开头的那个小朋友实现类似如下,明显没有满足自反性

return x > y ? 1 : -1;

当x == y时,sgn(compare(x, y)) = -1,-sgn(compare(y, x)) = 1,这违背了sgn(compare(x, y)) == -sgn(compare(y, x))约束,所以在JDK7中抛出了异常。

后续

这里贴个可以重现该错误的代码

import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/**
 *
 * Collections sort 要满足自反性否则会有概率出错,在1.7中会有问题
 * @author zhengyumin
 * @create 2019-07-18 10:24
 */
public class CollectionSort {
    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 2, 1, 0, 0, 0, 2, 30, 0, 3);
        list.sort((o1, o2) -> o1 > o2 ? 1 : -1);
        list.stream().forEach((Consumer) o -> System.out.println(o));
    }
}

上面描述的只是集合框架中的一个问题. 好了,回到我们今天的主题The Collections Framework.

相关资料可以参考:https://docs.oracle.com/javase/8/docs/technotes/guides/collections/index.html

The Collections Framework

API Enhancements(各版本API增强变化)

  • API Enhancements in Java SE 8 - An annotated list of API changes between release 7 and 8.

    • Support for Lambda Expressions, Streams, and Aggregate Operations
    • Performance Improvement for HashMaps with Key Collisions
  • API Enhancements in Java SE 7 - An annotated list of API changes between release 6 and 7.

    • New Interface, TransferQueue
    • Improved Hash Function
  • API Enhancements in Java SE 6 - An annotated list of API changes between release 5.0 and 6.

    • These new collection interfaces are provided:
      • Deque
      • BlockingDeque
      • NavigableSet
      • NavigableMap
      • ConcurrentNavigableMap
  • API Enhancements in J2SE 5.0 - An annotated list of API changes between release 1.4 and 5.0.

    • Three new collection interfaces are provided:
      • Queue
      • BlockingQueue
      • ConcurrentMap
    • Three new generic algorithms and one comparator converter were added to the Collections utility class:
      • frequency(Collection<?> c, Object o)
      • disjoint(Collection<?> c1, Collection<?> c2)
      • addAll(Collection<? super T> c, T... a)
      • Comparator<T> reverseOrder(Comparator<T> cmp)
  • API Enhancements in J2SDK 1.4 - An annotated list of API changes between release 1.3 and 1.4.

    • The Collections utility class has several new methods:

      • rotate(List list, int distance)
      • replaceAll(List list, Object oldVal, Object newVal)
      • indexOfSubList(List source, List target)
      • lastIndexOfSubList(List source, List target)
      • swap(List list, int i, int j)
      • list(Enumeration e)
    • New interface RandomAccess is a marker interface that allows List implementations to indicate that they support fast (generally constant time) random access.

    • New class LinkedHashMap provides an insertion-ordered Map implementation that runs nearly as fast as HashMap.

    • New class IdentityHashMap is an identity-based Map implementation based on a hash table.

Introduction to Collections (集合介绍)

​ 为什么需要集合框架?集合框架的组成?

What Is a Collections Framework?

A collections framework is a unified architecture for representing and manipulating collections. All collections frameworks contain the following:

  • Interfaces: These are abstract data types that represent collections. Interfaces allow collections to be manipulated independently of the details of their representation. In object-oriented languages, interfaces generally form a hierarchy.
  • Implementations: These are the concrete implementations of the collection interfaces. In essence, they are reusable data structures.
  • Algorithms: These are the methods that perform useful computations, such as searching and sorting, on objects that implement collection interfaces. The algorithms are said to be polymorphic: that is, the same method can be used on many different implementations of the appropriate collection interface. In essence, algorithms are reusable functionality.

所有集合框架都包含如下: 接口、实现、算法

more details , The collections framework consists of:

  • Collection interfaces (接口). Represent different types of collections, such as sets, lists, and maps. These interfaces form the basis of the framework.
    • Collection - A group of objects. No assumptions are made about the order of the collection (if any) or whether it can contain duplicate elements.
    • Set - The familiar set abstraction. No duplicate elements permitted. May or may not be ordered. Extends the Collection interface.
    • List - Ordered collection, also known as a sequence. Duplicates are generally permitted. Allows positional access. Extends the Collection interface.
    • Queue - A collection designed for holding elements before processing. Besides basic Collection operations, queues provide additional insertion, extraction, and inspection operations.
    • Deque - A double ended queue, supporting element insertion and removal at both ends. Extends the Queue interface.
    • Map - A mapping from keys to values. Each key can map to one value.
    • SortedSet - A set whose elements are automatically sorted, either in their natural ordering (see the Comparable interface) or by a Comparator object provided when a SortedSet instance is created. Extends the Set interface.
    • SortedMap - A map whose mappings are automatically sorted by key, either using the natural ordering of the keys or by a comparator provided when a SortedMap instance is created. Extends the Map interface.
    • NavigableSet - A SortedSet extended with navigation methods reporting closest matches for given search targets. A NavigableSet may be accessed and traversed in either ascending or descending order.
    • NavigableMap - A SortedMap extended with navigation methods returning the closest matches for given search targets. A NavigableMap can be accessed and traversed in either ascending or descending key order.
    • BlockingQueue - A Queue with operations that wait for the queue to become nonempty when retrieving an element and that wait for space to become available in the queue when storing an element. (This interface is part of the java.util.concurrent package.)
    • TransferQueue - A BlockingQueue in which producers can wait for consumers to receive elements. (This interface is part of the java.util.concurrent package.)
    • BlockingDeque - A Deque with operations that wait for the deque to become nonempty when retrieving an element and wait for space to become available in the deque when storing an element. Extends both the Deque and BlockingQueue interfaces. (This interface is part of the java.util.concurrent package.)
    • ConcurrentMap - A Map with atomic putIfAbsent, remove, and replace methods. (This interface is part of the java.util.concurrent package.)
    • ConcurrentNavigableMap - A ConcurrentMap that is also a NavigableMap.
  • General-purpose implementations (通常实现). Primary implementations of the collection interfaces.
    • HashSet - Hash table implementation of the Set interface. The best all-around implementation of the Set interface.
    • TreeSet - Red-black tree implementation of the NavigableSet interface.
    • LinkedHashSet - Hash table and linked list implementation of the Set interface. An insertion-ordered Set implementation that runs nearly as fast as HashSet.
    • ArrayList - Resizable array implementation of the List interface (an unsynchronized Vector). The best all-around implementation of the List interface.
    • ArrayDeque - Efficient, resizable array implementation of the Deque interface.
    • LinkedList - Doubly-linked list implementation of the List interface. Provides better performance than the ArrayList implementation if elements are frequently inserted or deleted within the list. Also implements the Deque interface. When accessed through the Queue interface, LinkedList acts as a FIFO queue.
    • PriorityQueue - Heap implementation of an unbounded priority queue.
    • HashMap - Hash table implementation of the Map interface (an unsynchronized Hashtable that supports null keys and values). The best all-around implementation of the Map interface.
    • TreeMap Red-black tree implementation of the NavigableMap interface.
    • LinkedHashMap - Hash table and linked list implementation of the Map interface. An insertion-ordered Map implementation that runs nearly as fast as HashMap. Also useful for building caches (see removeEldestEntry(Map.Entry) ).
  • Legacy implementations. The collection classes from earlier releases, Vector and Hashtable, were retrofitted to implement the collection interfaces.
    • Vector - Synchronized resizable array implementation of the List interface with additional legacy methods.
    • Hashtable - Synchronized hash table implementation of the Map interface that does not allow null keys or values, plus additional legacy methods.
  • Special-purpose implementations (特殊实现). Implementations designed for use in special situations. These implementations display nonstandard performance characteristics, usage restrictions, or behavior.
    • WeakHashMap - An implementation of the Map interface that stores only weak references to its keys. Storing only weak references enables key-value pairs to be garbage collected when the key is no longer referenced outside of the WeakHashMap. This class is the easiest way to use the power of weak references. It is useful for implementing registry-like data structures, where the utility of an entry vanishes when its key is no longer reachable by any thread.
    • IdentityHashMap - Identity-based Map implementation based on a hash table. This class is useful for topology-preserving object graph transformations (such as serialization or deep copying). To perform these transformations, you must maintain an identity-based “node table” that keeps track of which objects have already been seen. Identity-based maps are also used to maintain object-to-meta-information mappings in dynamic debuggers and similar systems. Finally, identity-based maps are useful in preventing “spoof attacks” resulting from intentionally perverse equals methods. (IdentityHashMap never invokes the equals method on its keys.) An added benefit of this implementation is that it is fast.
    • CopyOnWriteArrayList - A List implementation backed by an copy-on-write array. All mutative operations (such as add, set, and remove) are implemented by making a new copy of the array. No synchronization is necessary, even during iteration, and iterators are guaranteed never to throwConcurrentModificationException. This implementation is well-suited to maintaining event-handler lists (where change is infrequent, and traversal is frequent and potentially time-consuming).
    • CopyOnWriteArraySet - A Set implementation backed by a copy-on-write array. This implementation is similar to CopyOnWriteArrayList. Unlike most Set implementations, the add, remove, and contains methods require time proportional to the size of the set. This implementation is well suited to maintaining event-handler lists that must prevent duplicates.
    • EnumSet - A high-performance Set implementation backed by a bit vector. All elements of each EnumSet instance must be elements of a single enum type.
    • EnumMap - A high-performance Map implementation backed by an array. All keys in each EnumMap instance must be elements of a single enum type.
  • Concurrent implementations (并发实现). Implementations designed for highly concurrent use.
    • ConcurrentLinkedQueue - An unbounded first in, first out (FIFO) queue based on linked nodes.
    • LinkedBlockingQueue - An optionally bounded FIFO blocking queue backed by linked nodes.
    • ArrayBlockingQueue - A bounded FIFO blocking queue backed by an array.
    • PriorityBlockingQueue - An unbounded blocking priority queue backed by a priority heap.
    • DelayQueue - A time-based scheduling queue backed by a priority heap.
    • SynchronousQueue - A simple rendezvous mechanism that uses the BlockingQueue interface.
    • LinkedBlockingDeque - An optionally bounded FIFO blocking deque backed by linked nodes.
    • LinkedTransferQueue - An unbounded TransferQueue backed by linked nodes.
    • ConcurrentHashMap - A highly concurrent, high-performance ConcurrentMap implementation based on a hash table. This implementation never blocks when performing retrievals and enables the client to select the concurrency level for updates. It is intended as a drop-in replacement for Hashtable. In addition to implementing ConcurrentMap, it supports all of the legacy methods of Hashtable.
    • ConcurrentSkipListSet - Skips list implementation of the NavigableSet interface.
    • ConcurrentSkipListMap - Skips list implementation of the ConcurrentNavigableMap interface.
  • Wrapper implementations . Add functionality, such as synchronization, to other implementations.
    • Collections.unmodifiableInterface - Returns an unmodifiable view of a specified collection that throws an UnsupportedOperationException if the user attempts to modify it.
    • Collections.synchronizedInterface - Returns a synchronized collection that is backed by the specified (typically unsynchronized) collection. As long as all accesses to the backing collection are through the returned collection, thread safety is guaranteed.
    • Collections.checkedInterface - Returns a dynamically type-safe view of the specified collection, which throws a ClassCastException if a client attempts to add an element of the wrong type. The generics mechanism in the language provides compile-time (static) type checking, but it is possible to bypass this mechanism. Dynamically type-safe views eliminate this possibility.
  • Convenience implementations. High-performance “mini-implementations” of the collection interfaces.
    • Arrays.asList - Enables an array to be viewed as a list.
    • emptySet, emptyList and emptyMap - Return an immutable empty set, list, or map.
    • singleton, singletonList, and singletonMap - Return an immutable singleton set, list, or map, containing only the specified object (or key-value mapping).
    • nCopies - Returns an immutable list consisting of n copies of a specified object.
  • Abstract implementations. Partial implementations of the collection interfaces to facilitate custom implementations.
    • AbstractCollection - Skeletal Collection implementation that is neither a set nor a list (such as a “bag” or multiset).
    • AbstractSet - Skeletal Set implementation.
    • AbstractList - Skeletal List implementation backed by a random access data store (such as an array).
    • AbstractSequentialList - Skeletal List implementation backed by a sequential access data store (such as a linked list).
    • AbstractQueue - Skeletal Queue implementation.
    • AbstractMap - Skeletal Map implementation.
  • Algorithms. Static methods that perform useful functions on collections, such as sorting a list.
    • sort(List) - Sorts a list using a merge sort algorithm, which provides average case performance comparable to a high quality quicksort, guaranteed O(n*log n) performance (unlike quicksort), and stability (unlike quicksort). A stable sort is one that does not reorder equal elements.
    • binarySearch(List, Object) - Searches for an element in an ordered list using the binary search algorithm.
    • reverse(List) - Reverses the order of the elements in a list.
    • shuffle(List) - Randomly changes the order of the elements in a list.
    • fill(List, Object) - Overwrites every element in a list with the specified value.
    • copy(List dest, List src) - Copies the source list into the destination list.
    • min(Collection) - Returns the minimum element in a collection.
    • max(Collection) - Returns the maximum element in a collection.
    • rotate(List list, int distance) - Rotates all of the elements in the list by the specified distance.
    • replaceAll(List list, Object oldVal, Object newVal) - Replaces all occurrences of one specified value with another.
    • indexOfSubList(List source, List target) - Returns the index of the first sublist of source that is equal to target.
    • lastIndexOfSubList(List source, List target) - Returns the index of the last sublist of source that is equal to target.
    • swap(List, int, int) - Swaps the elements at the specified positions in the specified list.
    • frequency(Collection, Object) - Counts the number of times the specified element occurs in the specified collection.
    • disjoint(Collection, Collection) - Determines whether two collections are disjoint, in other words, whether they contain no elements in common.
    • addAll(Collection<? super T>, T…) - Adds all of the elements in the specified array to the specified collection.
  • Infrastructure. Interfaces that provide essential support for the collection interfaces.
    • Iterators- Similar to the familiarEnumeration interface, but more powerful, and with improved method names.
      • Iterator - In addition to the functionality of the Enumeration interface, enables the user to remove elements from the backing collection with well-defined, useful semantics.
      • ListIterator - Iterator for use with lists. In addition to the functionality of the Iterator interface, supports bidirectional iteration, element replacement, element insertion, and index retrieval.
    • Ordering
      • Comparable - Imparts a natural ordering to classes that implement it. The natural ordering can be used to sort a list or maintain order in a sorted set or map. Many classes were retrofitted to implement this interface.
      • Comparator - Represents an order relation, which can be used to sort a list or maintain order in a sorted set or map. Can override a type’s natural ordering or order objects of a type that does not implement the Comparable interface.
    • Runtime exceptions
      • UnsupportedOperationException - Thrown by collections if an unsupported optional operation is called.
      • ConcurrentModificationException - Thrown by iterators and list iterators if the backing collection is changed unexpectedly while the iteration is in progress. Also thrown by sublist views of lists if the backing list is changed unexpectedly.
    • Performance
      • RandomAccess - Marker interface that lets List implementations indicate that they support fast (generally constant time) random access. This lets generic algorithms change their behavior to provide good performance when applied to either random or sequential access lists.
  • Array Utilities. Utility functions for arrays of primitive types and reference objects. Not, strictly speaking, a part of the collections framework, this feature was added to the Java platform at the same time as the collections framework and relies on some of the same infrastructure.
    • Arrays - Contains static methods to sort, search, compare, hash, copy, resize, convert to String, and fill arrays of primitives and objects.

更多细节:https://docs.oracle.com/javase/8/docs/technotes/guides/collections/reference.html

Benefits of the Java Collections Framework(实现集合框架的好处)

The Java Collections Framework provides the following benefits:

  • Reduces programming effort: By providing useful data structures and algorithms, the Collections Framework frees you to concentrate on the important parts of your program rather than on the low-level “plumbing” required to make it work. By facilitating interoperability among unrelated APIs, the Java Collections Framework frees you from writing adapter objects or conversion code to connect APIs.
  • Increases program speed and quality: This Collections Framework provides high-performance, high-quality implementations of useful data structures and algorithms. The various implementations of each interface are interchangeable, so programs can be easily tuned by switching collection implementations. Because you’re freed from the drudgery of writing your own data structures, you’ll have more time to devote to improving programs’ quality and performance.
  • Allows interoperability among unrelated APIs: The collection interfaces are the vernacular by which APIs pass collections back and forth. If my network administration API furnishes a collection of node names and if your GUI toolkit expects a collection of column headings, our APIs will interoperate seamlessly, even though they were written independently.
  • Reduces effort to learn and to use new APIs: Many APIs naturally take collections on input and furnish them as output. In the past, each such API had a small sub-API devoted to manipulating its collections. There was little consistency among these ad hoc collections sub-APIs, so you had to learn each one from scratch, and it was easy to make mistakes when using them. With the advent of standard collection interfaces, the problem went away.
  • Reduces effort to design new APIs: This is the flip side of the previous advantage. Designers and implementers don’t have to reinvent the wheel each time they create an API that relies on collections; instead, they can use standard collection interfaces.
  • Fosters software reuse: New data structures that conform to the standard collection interfaces are by nature reusable. The same goes for new algorithms that operate on objects that implement these interfaces.

Collection Interfaces

Two interface trees, one starting with Collection and including Set, SortedSet, List, and Queue, and the other starting with Map and including SortedMap.

主要分成两部分,Collection和Map

The collection interfaces are divided into two groups. The most basic interface, java.util.Collection, has the following descendants:

  • java.util.Set
  • java.util.SortedSet
  • java.util.NavigableSet
  • java.util.Queue
  • java.util.concurrent.BlockingQueue
  • java.util.concurrent.TransferQueue
  • java.util.Deque
  • java.util.concurrent.BlockingDeque

The other collection interfaces are based on java.util.Map and are not true collections. However, these interfaces contain collection-view operations, which enable them to be manipulated as collections. Map has the following offspring:

  • java.util.SortedMap
  • java.util.NavigableMap
  • java.util.concurrent.ConcurrentMap
  • java.util.concurrent.ConcurrentNavigableMap

专业术语, modifiable、mutable、variable-size、sequential access

Several terms are introduced to aid in this specification:

  • Collections that do not support modification operations (such as add, remove and clear) are referred to as unmodifiable. Collections that are not unmodifiable are modifiable.
  • Collections that additionally guarantee that no change in the Collection object will be visible are referred to as immutable. Collections that are not immutable are mutable.
  • Lists that guarantee that their size remains constant even though the elements can change are referred to as fixed-size. Lists that are not fixed-size are referred to as variable-size.
  • Lists that support fast (generally constant time) indexed element access are known as random access lists. Lists that do not support fast indexed element access are known as sequential access lists. The RandomAccess marker interface enables lists to advertise the fact that they support random access. This enables generic algorithms to change their behavior to provide good performance when applied to either random or sequential access lists.

Collection Implementations

Classes that implement the collection interfaces typically have names in the form of <*Implementation-style*><*Interface*>. The general purpose implementations are summarized in the following table:

Interface Hash Table Resizable Array Balanced Tree Linked List Hash Table + Linked List
Set HashSet TreeSet LinkedHashSet
List ArrayList LinkedList
Deque ArrayDeque LinkedList
Map HashMap TreeMap LinkedHashMap

The general-purpose implementations support all of the optional operations in the collection interfaces and have no restrictions on the elements they may contain. They are unsynchronized, but the Collections class contains static factories called synchronization wrappers that can be used to add synchronization to many unsynchronized collections. All of the new implementations have fail-fast iterators, which detect invalid concurrent modification, and fail quickly and cleanly (rather than behaving erratically).

​ The AbstractCollection, AbstractSet, AbstractList, AbstractSequentialList and AbstractMap classes provide basic implementations of the core collection interfaces, to minimize the effort required to implement them. The API documentation for these classes describes precisely how each method is implemented so the implementer knows which methods must be overridden, given the performance of the basic operations of a specific implementation.

Concurrent Collections

Applications that use collections from more than one thread must be carefully programmed. In general, this is known as concurrent programming. The Java platform includes extensive support for concurrent programming. See Java Concurrency Utilities for details.

Collections are so frequently used that various concurrent friendly interfaces and implementations of collections are included in the APIs. These types go beyond the synchronization wrappers discussed previously to provide features that are frequently needed in concurrent programming.

interfaces

These concurrent-aware interfaces are available:

  • BlockingQueue
  • TransferQueue
  • BlockingDeque
  • ConcurrentMap
  • ConcurrentNavigableMap

implementation

The following concurrent-aware implementation classes are available. See the API documentation for the correct usage of these implementations.

  • LinkedBlockingQueue
  • ArrayBlockingQueue
  • PriorityBlockingQueue
  • DelayQueue
  • SynchronousQueue
  • LinkedBlockingDeque
  • LinkedTransferQueue
  • CopyOnWriteArrayList
  • CopyOnWriteArraySet
  • ConcurrentSkipListSet
  • ConcurrentHashMap
  • ConcurrentSkipListMap

转载请注明来源,欢迎对文章中的引用来源进行考证,欢迎指出任何有错误或不够清晰的表达。可以在下面评论区评论,也可以邮件至 951488791@qq.com

文章标题:JDK_The Collections Framework

字数:4.1k

本文作者:zhengyumin

发布时间:2019-07-29, 15:26:01

最后更新:2020-01-12, 23:08:17

原始链接:http://zyumin.github.io/2019/07/29/JDK-The-Collections-Framework/

版权声明: "署名-非商用-相同方式共享 4.0" 转载请保留原文链接及作者。