
Link: http://codepad.org/i9pnQpUZ

mohit_at_codepad - Haskell, pasted on Mar 13:
-- Written by Mohit Jain

-- Problem statement: Read the input and insert a comma between words
-- Words are separated by spaces
-- Anything inside quotation marks is treated as a single word

module Main where

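-- Import only intersperse: intercalate is defined at the end of this file,
-- and also importing a Data.List that exports intercalate would make the name ambiguous.
import Data.List (intersperse)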
cn = "This is a test for 'skip quotation and replace space with comma' program in Haskell"
main = putStr . unlines . map splitWords . lines $ cn

splitWords    :: String -> String
splitWords    = f where
    -- words splits the string on runs of whitespace into a list of strings.
    -- mergeQuotes then merges the words between quotation marks
    -- back into a single string, and intercalate joins the result with commas.
    f  = intercalate "," . mergeQuotes . words

-- Supported quotation characters
singleQuoteCharacter = '\''
doubleQuoteCharacter = '"'

-- FIXME: Does not handle a quotation mark that is not a corner character (first or last) of a word
-- FIXME: Multiple spaces inside a quotation are collapsed into a single space
mergeQuotes   :: [String] -> [String]
-- Merge all quoted strings into one string each
-- Input and output are lists of strings.
-- Length of the output list is always less than or equal to the input list length
-- If the input has a space between a quotation mark and the word, it is not treated as a quotation start
-- To treat it as a quotation start, pre-process x or modify mergeEnclosing
mergeQuotes x = mergeSingleQuotes $ mergeDoubleQuotes x

mergeSingleQuotes :: [String] -> [String]
-- Merge all single quotations
-- Input and output are lists of strings.
-- Length of the output list is always less than or equal to the input list length
mergeSingleQuotes x = mergeEnclosing singleQuoteCharacter x

mergeDoubleQuotes :: [String] -> [String]
-- Merge all double quotations
-- Input and output are lists of strings.
-- Length of the output list is always less than or equal to the input list length
mergeDoubleQuotes x = mergeEnclosing doubleQuoteCharacter x

mergeEnclosing  :: Char -> [String] -> [String]
-- Merge all the strings enclosed by some character
-- The opening and closing character must be the same
-- The enclosing character must not appear as a single-character string in the list
-- An unterminated quotation is merged up to the end of the list
mergeEnclosing _ []          = []
mergeEnclosing _ (a:[])      = [a]
mergeEnclosing c ws@(a:as)   = finalArr where
       isQuoted = firstCharMatchesTo c a
       lastCharDoesNotMatchTo x y = not $ lastCharMatchesTo x y
       -- firstElement = if' isQuoted mergedResult a
       firstElement = if isQuoted then mergedResult else a
       mergedResult = unwords mergedArray
       -- take 1 / drop 1 instead of head / tail, so a missing closing
       -- quotation does not crash on an empty list
       mergedArray  = takeWhile (lastCharDoesNotMatchTo c) ws ++ take 1 afterMergedArray
       afterMergedArray = dropWhile (lastCharDoesNotMatchTo c) ws
       remainingArr = if isQuoted then drop 1 afterMergedArray else as
       finalArr     = firstElement : mergeEnclosing c remainingArr

firstCharMatchesTo  ::  Char -> String -> Bool
-- Checks if first character of String is same as Char
-- firstCharMatchesTo '"' "'test" is False
-- firstCharMatchesTo '"' "\"test" is True
firstCharMatchesTo _ [] = False
firstCharMatchesTo firstChar st = head st == firstChar

lastCharMatchesTo  ::  Char -> String -> Bool
-- Checks if last character of String is same as Char
-- lastCharMatchesTo '"' "test'" is False
-- lastCharMatchesTo '"' "test\"" is True
lastCharMatchesTo _ [] = False
lastCharMatchesTo lastChar st = last st == lastChar

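-- codepad reported <ERROR line 14 - Undefined variable "intercalate">,
-- so intercalate is implemented here on top of intersperse.
-- With a Data.List that already exports intercalate, import it from there
-- and drop this definition.
intercalate :: [a] -> [[a]] -> [a]
intercalate x = concat . intersperse x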


Output:
This,is,a,test,for,'skip quotation and replace space with comma',program,in,Haskell
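
Both FIXMEs in the paste (quotation marks that are not at the edge of a word, and runs of spaces inside a quotation being collapsed) come from calling words before looking at the quotation marks. A minimal sketch of a character-level alternative follows. It is not the author's method: splitQuoted and commaSeparate are hypothetical names, it assumes a Data.List that exports intercalate (as in GHC), and it assumes every ' or " outside a quotation opens one, so an apostrophe in a word like don't would be misread as a quotation start.

```haskell
module Main where

import Data.Char (isSpace)
import Data.List (intercalate)

-- Hypothetical helper, not part of the paste: split on whitespace that is
-- outside quotation marks, scanning one character at a time.
splitQuoted :: String -> [String]
splitQuoted = go Nothing ""
  where
    -- go openQuote reversedCurrentWord remainingInput
    go :: Maybe Char -> String -> String -> [String]
    go _ acc [] = emit acc []
    go Nothing acc (ch:rest)
      | isSpace ch      = emit acc (go Nothing "" rest)  -- word boundary
      | ch `elem` "'\"" = go (Just ch) (ch:acc) rest     -- quotation opens
      | otherwise       = go Nothing (ch:acc) rest
    go (Just q) acc (ch:rest)
      | ch == q         = go Nothing (ch:acc) rest       -- quotation closes
      | otherwise       = go (Just q) (ch:acc) rest      -- keep spaces as-is

    -- Emit the finished word (stored reversed) unless it is empty.
    emit acc ws = if null acc then ws else reverse acc : ws

-- Hypothetical counterpart of splitWords from the paste.
commaSeparate :: String -> String
commaSeparate = intercalate "," . splitQuoted

main :: IO ()
main = putStrLn (commaSeparate "a  'b  c' d\"e f\"g")
-- prints: a,'b  c',d"e f"g
```

Because the quotation state is tracked per character, the two spaces inside 'b  c' survive and the quotation in the middle of d"e f"g is kept as one word. An unterminated quotation simply runs to the end of the line.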




Comments:
posted by mohit_at_codepad on Jun 12
Line 34 can be better rewritten as:
mergeQuotes = mergeSingleQuotes . mergeDoubleQuotes

On lines 40 and 46 you can also drop the argument:
mergeSingleQuotes = mergeEnclosing singleQuoteCharacter
mergeDoubleQuotes = mergeEnclosing doubleQuoteCharacter

You won't need to implement your own intercalate on most compilers.
If you still implement it, line 81 can be made fully point-free as:
intercalate = (concat .) . intersperse
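
Dropping both arguments of a two-argument function takes a little care, because intersperse applied to a separator returns a function, and concat cannot be composed onto it directly. A small sketch that compiles both forms side by side is below; joinWith and joinWith' are hypothetical names chosen to avoid clashing with an intercalate exported by Data.List.

```haskell
module Main where

import Data.List (intersperse)

-- Point-free in the list argument only; the separator stays named.
joinWith :: [a] -> [[a]] -> [a]
joinWith sep = concat . intersperse sep

-- Fully point-free: the section (concat .) composes concat after the
-- one-argument function that intersperse returns for a given separator.
joinWith' :: [a] -> [[a]] -> [a]
joinWith' = (concat .) . intersperse

main :: IO ()
main = do
  putStrLn (joinWith  "," ["a", "b", "c"])  -- prints: a,b,c
  putStrLn (joinWith' "," ["a", "b", "c"])  -- prints: a,b,c
```

Writing plain concat . intersperse without the section is rejected by the type checker, since concat would be applied to a function rather than to a list of lists.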
