In a recent message I referenced some changes I made to this function and included a copy of it. I mentioned that I wanted to refactor it after getting the behavior I wanted because it’s rather unreadable. So let’s see how I would refactor this:
Function PS_GetFileHashes(aFiles As Variant, sHashAlgorithm As String) As Scripting.Dictionary
Const PS_Cmd_Beg As String = "Get-ChildItem -Path "
Const PS_Cmd_Mid As String = " -File | Get-FileHash -Algorithm "
Const PS_Cmd_End As String = " | ForEach { $_.Path + '|' + $_.Hash }"
Dim t As New HAL_TickCounter: t.Init
Dim LenCmd As Long, aIdx As Long
t.Elapsed "GetFileHashes Begin"
Dim retVal As New Scripting.Dictionary
Dim sPSCmd As String, itm As Variant, sPathsPart As String, sPreviousPart As String, aOutput As Variant, itmParts As Variant
On Error GoTo Error_Handler
Select Case sHashAlgorithm
Case "MACTripleDES", "MD5", "RIPEMD160", "SHA1", "SHA256", "SHA384", "SHA512"
Case Else
MsgBox "Unknown Algorithm", vbCritical Or vbOKOnly, "Operation Aborted"
GoTo Error_Handler_Exit
End Select
LenCmd = Len(PS_Cmd_Beg & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End)
' Need to split aFiles into manageable segments. 515 files overloaded the PS_GetOutput in this case
' Maybe get every 50 - 100 files at a time.
' Looks like 32656 characters is near the limit, 32766 is over the limit 32768 is
aIdx = LBound(aFiles)
Do While aIdx <= UBound(aFiles)
sPathsPart = ""
Do
sPreviousPart = sPathsPart
sPathsPart = sPathsPart & HALutil.WrapText(aFiles(aIdx), "'") & ","
aIdx = aIdx + 1
If aIdx >= UBound(aFiles) Then Exit Do
Loop Until Len(sPathsPart) - 1 + LenCmd > 32656
' We left because either we reached the end of the array or we exceeded the limit.
' If we exceeded the limit then we need to back one element off of sPathsPart and the aIdx
If Len(sPathsPart) - 1 + LenCmd > 32656 Then
aIdx = aIdx - 1
sPathsPart = sPreviousPart
End If
'Remove the comma from the end if there, if empty, use *.*
If Len(sPathsPart) > 0 Then sPathsPart = Left(sPathsPart, Len(sPathsPart) - 1) Else sPathsPart = "'*.*'"
' Build the PowerShell command string
sPSCmd = PS_Cmd_Beg & sPathsPart & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End
t.Elapsed "GetFileHashes Start GetOutput"
aOutput = Split(CStr(PS_GetOutput(sPSCmd)), vbCrLf)
t.Elapsed "GetFileHashes End GetOutput"
' Add the output for filenames / hashes to the dictionary to return
For Each itm In aOutput
itmParts = Split(itm, "|")
If UBound(itmParts) > 0 Then retVal.Add itmParts(0), itmParts(1)
Next itm
Loop
t.Elapsed "GetFileHashes End"
Set PS_GetFileHashes = retVal
Error_Handler_Exit:
On Error Resume Next
Exit Function
'Get-ChildItem -Path '.\*.xls','.\*.csv' -Recurse -File | Get-FileHash -Algorithm SHA256
Error_Handler:
MsgBox "The following error has occured" & vbCrLf & vbCrLf & _
"Error Number: " & Err.Number & vbCrLf & _
"Error Source: PS_GetFileHash" & vbCrLf & _
"Error Description: " & Err.Description & _
Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _
, vbOKOnly + vbCritical, "An Error has Occured!"
Resume Next
Resume Error_Handler_Exit
End Function
So the first thing I don’t like is that I threw 3 constants in at the beginning to make it easier to create a Powershell command later. So I’m going to move those constants to a new function along with the code and simply refer to the function when I need it. I started by simply creating the function I’m going to use, but it’s not doing anything yet:
Function BuildFileHashesCommand() As String
Dim retVal As String: retVal = ""
BuildFileHashesCommand = retVal
End Function
Now I’ll copy and paste the constants in, and then copy and paste the code using the constants in:
Function BuildFileHashesCommand() As String
Dim retVal As String: retVal = ""
Const PS_Cmd_Beg As String = "Get-ChildItem -Path "
Const PS_Cmd_Mid As String = " -File | Get-FileHash -Algorithm "
Const PS_Cmd_End As String = " | ForEach { $_.Path + '|' + $_.Hash }"
sPSCmd = PS_Cmd_Beg & sPathsPart & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End
BuildFileHashesCommand = retVal
End Function
Ok, now I have some undefined variables in there. sPSCmd can be changed to retVal. And then I need the sPathsPart and sHashAlgorithm variables to be added as parameters to the function.
Function BuildFileHashesCommand(sPathsPart as String, sHashAlgorithm As String) As String
Dim retVal As String: retVal = ""
Const PS_Cmd_Beg As String = "Get-ChildItem -Path "
Const PS_Cmd_Mid As String = " -File | Get-FileHash -Algorithm "
Const PS_Cmd_End As String = " | ForEach { $_.Path + '|' + $_.Hash }"
retVal = PS_Cmd_Beg & sPathsPart & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End
BuildFileHashesCommand = retVal
End Function
Perfect, now I can do a Debug->Compile. This reveals that I am using the constants in other places in the function as well. One way to resolve this would be to simply move the constants to a global scope so that both functions will be able to use them. The only bad thing about that in my case would be that it moves the constants far away from the code using them as my functions are in a larger library. However, in this case, I’ll just move them. The other options would be to define the constants multiple times (NO!!! That’s not good people), create a class which would contain the constants privately with the public methods needed that will do the things I need to do. Ok, I’ve talked myself into it. Here’s the new class file using Insert->Class Module and I’m saving it as PS_FileHashesCommand:
Option Compare Database
Option Explicit
Const PS_Cmd_Beg As String = "Get-ChildItem -Path "
Const PS_Cmd_Mid As String = " -File | Get-FileHash -Algorithm "
Const PS_Cmd_End As String = " | ForEach { $_.Path + '|' + $_.Hash }"
Public Function BuildFileHashesCommand(sPathsPart As String, sHashAlgorithm As String) As String
Dim retVal As String: retVal = ""
retVal = PS_Cmd_Beg & sPathsPart & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End
BuildFileHashesCommand = retVal
End Function
Public Function LenCmdWithoutPaths(sHashAlgorithm) As Long
End Function
And now I will copy and paste the other line that uses the constants into the new Public Function I’ve created:
'PS_FileHashesCommand
'...
Public Function LenCmdWithoutPaths(sHashAlgorithm) As Long
Dim retVal As Long: retVal = 0
retVal = Len(PS_Cmd_Beg & PS_Cmd_Mid & sHashAlgorithm & PS_Cmd_End)
LenCmdWithoutPaths = retVal
End Function
Ok, now I can update the original function to use the new object and new functions and it will look like so:
Function PS_GetFileHashes(aFiles As Variant, sHashAlgorithm As String) As Scripting.Dictionary
Dim LenCmd As Long, aIdx As Long
Dim retVal As New Scripting.Dictionary
Dim sPSCmd As String, itm As Variant, sPathsPart As String, sPreviousPart As String, aOutput As Variant, itmParts As Variant
On Error GoTo Error_Handler
Select Case sHashAlgorithm
Case "MACTripleDES", "MD5", "RIPEMD160", "SHA1", "SHA256", "SHA384", "SHA512"
Case Else
MsgBox "Unknown Algorithm", vbCritical Or vbOKOnly, "Operation Aborted"
GoTo Error_Handler_Exit
End Select
Dim PScmd As New PS_FileHashesCommand
LenCmd = PScmd.LenCmdWithoutPaths(sHashAlgorithm)
' Need to split aFiles into manageable segments. 515 files overloaded the PS_GetOutput in this case
' Maybe get every 50 - 100 files at a time.
' Looks like 32656 characters is near the limit, 32766 is over the limit 32768 is
aIdx = LBound(aFiles)
Do While aIdx <= UBound(aFiles)
sPathsPart = ""
Do
sPreviousPart = sPathsPart
sPathsPart = sPathsPart & HALutil.WrapText(aFiles(aIdx), "'") & ","
aIdx = aIdx + 1
If aIdx >= UBound(aFiles) Then Exit Do
Loop Until Len(sPathsPart) - 1 + LenCmd > 32656
' We left because either we reached the end of the array or we exceeded the limit.
' If we exceeded the limit then we need to back one element off of sPathsPart and the aIdx
If Len(sPathsPart) - 1 + LenCmd > 32656 Then
aIdx = aIdx - 1
sPathsPart = sPreviousPart
End If
'Remove the comma from the end if there, if empty, use *.*
If Len(sPathsPart) > 0 Then sPathsPart = Left(sPathsPart, Len(sPathsPart) - 1) Else sPathsPart = "'*.*'"
' Build the PowerShell command string
sPSCmd = PScmd.BuildFileHashesCommand(sPathsPart, sHashAlgorithm)
aOutput = Split(CStr(PS_GetOutput(sPSCmd)), vbCrLf)
' Add the output for filenames / hashes to the dictionary to return
For Each itm In aOutput
itmParts = Split(itm, "|")
If UBound(itmParts) > 0 Then retVal.Add itmParts(0), itmParts(1)
Next itm
Loop
Set PS_GetFileHashes = retVal
Error_Handler_Exit:
On Error Resume Next
Exit Function
'Get-ChildItem -Path '.\*.xls','.\*.csv' -Recurse -File | Get-FileHash -Algorithm SHA256
Error_Handler:
MsgBox "The following error has occured" & vbCrLf & vbCrLf & _
"Error Number: " & Err.Number & vbCrLf & _
"Error Source: PS_GetFileHash" & vbCrLf & _
"Error Description: " & Err.Description & _
Switch(Erl = 0, "", Erl <> 0, vbCrLf & "Line No: " & Erl) _
, vbOKOnly + vbCritical, "An Error has Occured!"
Resume Next
Resume Error_Handler_Exit
End Function
This is slightly improved, but there’s a long way to go to make it readable. I’ll continue tomorrow.