Modules_Haskell Notes 2

1. Import

The syntax format for importing modules is:

-- 把模块中所有函数加入全局命名空间
import <module>
--  部分引用
import <module> (fn1, fn2)
-- 引入数据类型及其值构造器
import <module> (Datatype(constructor, constructor))
-- 引入所有值构造器
import <module> (Datatype(..))
--  hiding 排除
import <module> hiding (fn)
--  保留命名空间
import qualified <module>
--  保留命名空间，并起别名
import qualified <module> as <alias>

For example:

import Data.List
import Data.List (nub, sort)
import Data.Tree (Tree(Node, Branch))
import Data.Tree (Tree(..))
import Data.List hiding (nub)
import qualified Data.Map
import qualified Data.Map as M

The hiding syntax can alleviate naming conflict problems, but it's not very convenient. For modules with a large number of naming conflicts, you can use qualified to preserve namespaces to avoid conflicts

GHCi Environment

Import modules through the :m command:

> :m Data.List
> :m Data.List Data.Map Data.Set

After GHC 7.0, supports directly using import syntax in GHCi environment:

> import qualified Data.Map as M
> M.fromList [('a', 1)]
fromList [('a',1)]

So, no need to worry about environment differences, see details in import qualified in GHCI

2. Declaration

Modules are used to organize code, for example, putting functions with similar functionality into the same module

For example, binary tree module definition:

module BTree
-- 声明要暴露出去的函数及数据类型
( Tree
, singleton
, add
, fromList
, find
) where
-- 引入依赖模块
-- 定义数据类型及函数
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)
singleton x = Node x EmptyTree EmptyTree

Note:

Module name must match filename, so corresponding filename should be BTree.hs
Module declaration must be on the first line (cannot have import or similar things before, import can be placed after where)

Exporting data structures in modules is similar to import syntax:

module MyModule (Tree(Branch, Leaf)) where

data Tree a = Branch {left, right :: Tree a} | Leaf a

Only exposes data structure Tree and its constructors Branch and Leaf, can also use .. to expose all value constructors:

module MyModule (Tree(..))

Or don't expose value constructors, only allow obtaining values of this type through factory methods etc. (common example like Map.fromList):

module MyModule (Tree, factory)

The disadvantage is, this makes it impossible to use value constructors for pattern matching

Submodules

Modules have tree hierarchical structure, modules can have submodules, submodules can have submodules...

Has requirements for directory structure and naming, for example:

.
├── main.hs
└── Math
    ├── Number.hs
    └── Vector.hs

Package name requires first letter uppercase (Math), submodule filename must match submodule name, case sensitivity is environment-dependent (e.g., OSX is not sensitive)

3. Standard Library Modules

Standard library has many powerful built-in functions, can check usage examples, type declarations, even source code through Hoogle, very convenient

Data.List

Provides many List operation functions, commonly used ones like map, filter, and:

Predicates:

-- every, True only if all are True
and :: Foldable t => t Bool -> Bool
-- some, True if one is True
or :: Foldable t => t Bool -> Bool
-- commonly used some, True if any element in List satisfies condition
any :: Foldable t => (a -> Bool) -> t a -> Bool
-- commonly used every, True only if all elements in List satisfy condition
all :: Foldable t => (a -> Bool) -> t a -> Bool

Constructing new Lists:

-- insert separator element in array
intersperse :: a -> [a] -> [a]
-- similar to intersperse, insert one-dimensional array as separator in two-dimensional array, then flatten to one-dimensional
intercalate :: [a] -> [[a]] -> [a]
-- two-dimensional array row/column transpose
transpose :: [[a]] -> [[a]]
-- reduce dimension (connect a group of Lists into one List)
concat :: Foldable t => t [a] -> [a]
-- map then reduce dimension, equivalent to concat . map
concatMap :: Foldable t => (a -> [b]) -> t a -> [b]
-- infinite recursion, pass return value back in
iterate :: (a -> a) -> a -> [a]
-- split by position, return two split parts
splitAt :: Int -> [a] -> ([a], [a])
-- take elements until condition is not satisfied
takeWhile :: (a -> Bool) -> [a] -> [a]
-- drop elements until condition is not satisfied
dropWhile :: (a -> Bool) -> [a] -> [a]
-- split by condition (first position where condition is not satisfied), similar to takeWhile
span :: (a -> Bool) -> [a] -> ([a], [a])
-- split by condition (first position where condition is satisfied)
break :: (a -> Bool) -> [a] -> ([a], [a])
-- recursive init, until List is empty
inits :: [a] -> [[a]]
-- recursive tail, until List is empty
tails :: [a] -> [[a]]

Sorting:

-- merge sort
sort :: Ord a => [a] -> [a]
-- insert before first element in List that is greater than or equal to the element
insert :: Ord a => a -> [a] -> [a]

Grouping:

-- group, based on adjacent and equal values
group :: Eq a => [a] -> [[a]]
-- group by condition, one group satisfies, one group doesn't satisfy
partition :: (a -> Bool) -> [a] -> ([a], [a])

Matching:

-- substring matching (subList matching), whether contains specified substring
isInfixOf :: Eq a => [a] -> [a] -> Bool
-- substring matching, whether starts with specified substring
isPrefixOf :: Eq a => [a] -> [a] -> Bool
-- substring matching, whether ends with specified substring
isSuffixOf :: Eq a => [a] -> [a] -> Bool
-- element inclusion check, whether contains specified element
elem :: (Foldable t, Eq a) => a -> t a -> Bool
-- element inclusion check, whether does not contain specified element
notElem :: (Foldable t, Eq a) => a -> t a -> Bool

Searching:

-- search by condition, return first element satisfying condition
find :: Foldable t => (a -> Bool) -> t a -> Maybe a
-- search, return first matching element index or Nothing
elemIndex :: Eq a => a -> [a] -> Maybe Int
-- search all
elemIndices :: Eq a => a -> [a] -> [Int]
-- similar to find, but returns first satisfying element index
findIndex :: (a -> Bool) -> [a] -> Maybe Int
-- similar to find, but returns indices of all satisfying items
findIndices :: (a -> Bool) -> [a] -> [Int]

Combining:

-- combine Lists, also zip3 ~ zip7
zip :: [a] -> [b] -> [(a, b)]
-- combine Lists, and map once, also zipWith3 ~ zipWith7
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]

Text processing:

-- split string by lines (\n)
lines :: String -> [String]
-- join newlines (\n)
unlines :: [String] -> String
-- split by whitespace
words :: String -> [String]
-- join spaces
unwords :: [String] -> String

Removing elements:

-- remove duplicates
nub :: Eq a => [a] -> [a]
-- delete first matching element
delete :: Eq a => a -> [a] -> [a]

Set operations:

-- find difference set, if duplicate elements, only delete first
(\\) :: Eq a => [a] -> [a] -> [a]
-- find union set
union :: Eq a => [a] -> [a] -> [a]
-- find intersection set
intersect :: Eq a => [a] -> [a] -> [a]

More Generic Versions

Functions like length, take, drop, splitAt, !!, replicate have requirements for Int type parameters or return values, not generic enough, therefore provided more generic corresponding versions:

genericLength :: Num i => [a] -> i
genericTake :: Integral i => i -> [a] -> [a]
genericDrop :: Integral i => i -> [a] -> [a]
genericSplitAt :: Integral i => i -> [a] -> ([a], [a])
genericIndex :: Integral i => [a] -> i -> a
genericReplicate :: Integral i => i -> a -> [a]

nub, delete, union, intsect, group, sort, insert, maximum, minimum all judge equality through ==, also provided more generic versions allowing self-judging equality:

nubBy :: (a -> a -> Bool) -> [a] -> [a]
deleteBy :: (a -> a -> Bool) -> a -> [a] -> [a]
unionBy :: (a -> a -> Bool) -> [a] -> [a] -> [a]
intersectBy :: (a -> a -> Bool) -> [a] -> [a] -> [a]
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
sortBy :: (a -> a -> Ordering) -> [a] -> [a]
insertBy :: (a -> a -> Ordering) -> a -> [a] -> [a]
maximumBy :: Foldable t => (a -> a -> Ordering) -> t a -> a
minimumBy :: Foldable t => (a -> a -> Ordering) -> t a -> a

Functions with By are usually used together with Data.Function.on:

on :: (b -> b -> c) -> (a -> b) -> a -> a -> c
f `on` g = \x y -> f (g x) (g y)

For example, to express grouping adjacent elements by positive/negative:

groupBy ((==) `Data.Function.on` (> 0)) values

Semantics is very clear: classify elements by whether they are greater than zero

Additionally, sort is equivalent to sortBy compare (default comparison method is compare), to sort by List length, do this:

sortBy (compare `on` length) xs

Semantics is also very clear. So

(==) `on`
compare `on`

are both very great idiomatic patterns

P.S. Can use :browse <module> command to view type declarations of all functions and data types in module

Data.Char

String is actually [Char]:

type String = [Char] 	-- Defined in 'GHC.Base'

So when processing strings, often use Data.Char module, provides many character-related functions

Determining character range:

--  control character
isControl :: Char -> Bool
--  whitespace
isSpace :: Char -> Bool
-- lowercase Unicode character
isLower :: Char -> Bool
-- uppercase Unicode character
isUpper :: Char -> Bool
-- letter
isAlpha :: Char -> Bool
-- letter or digit
isAlphaNum :: Char -> Bool
-- printable character
isPrint :: Char -> Bool
-- ASCII digit, 0-9
isDigit :: Char -> Bool
-- octal digit
isOctDigit :: Char -> Bool
-- hexadecimal digit
isHexDigit :: Char -> Bool
-- letter, functionally equivalent to isAlpha, different implementation
isLetter :: Char -> Bool
-- Unicode diacritical mark, e.g., French
isMark :: Char -> Bool
-- Unicode number, including Roman numerals etc.
isNumber :: Char -> Bool
-- punctuation
isPunctuation :: Char -> Bool
-- currency symbol
isSymbol :: Char -> Bool
-- Unicode space or separator
isSeparator :: Char -> Bool
-- ASCII character (first 128 positions of Unicode alphabet)
isAscii :: Char -> Bool
-- first 256 positions of Unicode alphabet
isLatin1 :: Char -> Bool
-- uppercase ASCII character
isAsciiUpper :: Char -> Bool
-- lowercase ASCII character
isAsciiLower :: Char -> Bool

Determining category:

generalCategory :: Char -> GeneralCategory

Returned GeneralCategory is an enum, total 30 categories, for example:

> generalCategory 'a'
LowercaseLetter
> generalCategory ' '
Space

Character conversion:

-- convert to uppercase
toUpper :: Char -> Char
-- convert to lowercase
toLower :: Char -> Char
-- convert to title form, similar to toUpper, some ligatures have differences
toTitle :: Char -> Char
-- character to digit, requires [0-9,a-f,A-F]
digitToInt :: Char -> Int
-- digit to character
intToDigit :: Int -> Char
-- character to Unicode code
ord :: Char -> Int
-- Unicode code to character
chr :: Int -> Char

So, to implement simple encryption/decryption can do this:

encode shift = map $ chr . (+ shift) . ord
decode shift = map $ chr . (subtract shift) . ord
-- or more skillful
decode shift = encode $ negate shift

Data.Map

Dictionary is unordered list of key-value pairs, stored as balanced binary tree, Data.Map provides some dictionary processing functions

P.S. Some functions in Data.Map have naming conflicts with Prelude and Data.List modules, so use qualified import as to preserve namespace and create alias:

import qualified Data.Map as Map

Constructing new Map:

-- List to Map, if duplicate keys, take last value
Map.fromList :: Ord k => [(k, a)] -> Map.Map k a
-- Map to List
Map.toList :: Map.Map k a -> [(k, a)]
-- similar to fromList, doesn't directly discard duplicate keys, allows manual handling
Map.fromListWith :: Ord k => (a -> a -> a) -> [(k, a)] -> Map.Map k a
-- empty Map
Map.empty :: Map.Map k a
-- insert (k, a), return new Map
Map.insert :: Ord k => k -> a -> Map.Map k a -> Map.Map k a
-- similar to insert, allows handling duplicate keys
Map.insertWith :: Ord k => (a -> a -> a) -> k -> a -> Map.Map k a -> Map.Map k a
-- single element Map
Map.singleton :: k -> a -> Map.Map k a
-- map to new Map
Map.map :: (a -> b) -> Map.Map k a -> Map.Map k b
-- filter new Map
Map.filter :: (a -> Bool) -> Map.Map k a -> Map.Map k a

Getting Map information:

-- check empty
Map.null :: Map.Map k a -> Bool
-- length
Map.size :: Map.Map k a -> Int
-- get all keys
Map.keys :: Map.Map k a -> [k]
-- get all values
Map.elems :: Map.Map k a -> [a]

Searching:

-- search by key
Map.lookup :: Ord k => k -> Map.Map k a -> Maybe a
-- inclusion judgment
Map.member :: Ord k => k -> Map.Map k a -> Bool

Data.Set

Provides set-related utility functions, structurally similar to Map, both stored as tree structure

P.S. Similarly, also has many naming conflicts, needs qualified import:

import qualified Data.Set as Set

Constructing sets:

-- List to Set
Set.fromList :: Ord a => [a] -> Set.Set a

Set operations:

-- find intersection
Set.intersection :: Ord a => Set.Set a -> Set.Set a -> Set.Set a
-- find difference
Set.difference :: Ord a => Set.Set a -> Set.Set a -> Set.Set a
-- find union
Set.union :: Ord a => Set.Set a -> Set.Set a -> Set.Set a
-- judge subset
Set.isSubsetOf :: Ord a => Set.Set a -> Set.Set a -> Bool
-- judge proper subset
Set.isProperSubsetOf :: Ord a => Set.Set a -> Set.Set a -> Bool

Note, function names are mischievous, array's List.intersect becomes Set.intersection in sets

Many functions in Map also have corresponding versions in Set, such as null, size, member, empty, singleton, insert, delete, map, filter etc.

Similarly, sets can be used to implement one-line deduplication:

unique :: Ord a => [a] -> [a]
unique = Set.toList . Set.fromList

Set deduplication efficiency is higher than List.nub, but disadvantage is constructing set will sort elements, so obtained deduplication result doesn't preserve original order (List.nub preserves)

References

Haskell/Modules
Haskell data type pattern matching: Pattern matching custom data types